Optimal Structures for Multimedia Instruction¹


Joseph  Goguen  and  Charlotte  Linde 
SRI  International  and  Structural  Semantics 


1 Introduction


This report describes the first year's work in a two-year study of optimal structures for
multimedia  instruction.  The  project  has  two  phases.  The  first  phase  elicits  experienced 
instructors’  explanations  of  a  demonstration  device,  in  order  to  obtain  for  analysis  a  significant 
range  of  the  possible  discourse  structures  that  occur  in  instruction.  The  outcome  of  this  phase 
is  a  set  of  variables,  and  a  set  of  hypotheses  about  relationships  among  them  that  lead  to 
effective  instruction.  The  second  phase  will  test  these  hypotheses  on  groups  of  students. 

The  aim  of  this  project  is  to  provide  experimentally  validated  guidelines  both  for  the  design  of 
computer-based  instruction  generation  systems,  and  for  human  instruction  in  a  multimedia 
setting.  Potential  applications  for  this  research  include  multimedia  output  capability  (e.g., 
graphics  output  plus  audio,  using  speech  technology)  for  automatic  instructional  systems  and 
for  onboard  fault  diagnosis  systems,  as  well  as  the  improvement  of  traditional  classroom 
instruction. 

Four major results have been achieved so far: (1) a framework for discussing optimal discourse
structures and/or visual presentations in multimedia instruction, based upon the notion of a
mapping between semiotic systems, discussed in Section 3; (2) the discovery that the
command and control speech act chain is used in "hands-on" instruction (our structure theory
of this discourse type is given in Appendix II); (3) a rich set of experimental hypotheses, given
in Section 6; and (4) a demonstration of the viability of a methodology combining linguistic
analysis with experimental research.


¹ We would like to thank Marshall Farr and Henry Halff, of the Office of Naval Research, for helping to
conceptualise  and  focus  this  project,  and  our  consultants  Tora  Bikson  and  James  Weiner  for  their  suggestions 
and  help  throughout  the  work. 

1.1 Method

This research concerns instruction, particularly mixed media explanation, involving, for
example, language, diagrams, and demonstration equipment. The approach draws from
linguistics  (specifically,  discourse  analysis),  experimental  psychology,  and  philosophy 
(specifically,  semiotics).  The  purpose  of  this  subsection  is  to  provide  enough  information  on 
what  we  are  doing  so  that  the  reader  can  follow  the  explanations  and  examples  below.  Most 
of  this  subsection  concerns  the  Phase  I  experiments  already  completed.  See  Section  4.1  and 
Appendix  I  for  some  further  details,  especially  regarding  our  Phase  I  pilot  experiments  and  our 
plans  for  the  rest  of  the  project. 

1.1.1  Task 

Instructors  are  given  a  "logic  box"  having  four  lights  and  two  switches  (see  Figure  1).  Each 
light  realizes  a  different  logical  function  of  the  two  switches.  (Note  that  there  are  sixteen  such 
functions,  of  which  just  four  are  realized  in  the  actual  box.)  Instructors  are  also  given  a 
blackboard  with  colored  chalk.  Their  task  is  to  explain  to  students  how  to  use  the  logic  box; 
students  are  to  set  the  switches  so  as  to  achieve  some  given  configuration  of  lights.  After 
several  trials,  we  developed  the  following  approach:  Students  are  told  that  they  are  being 
trained  to  control  an  irrigation  system  producing  a  continuous  flow  of  fertilized  water,  and 
that  each  light  indicates  whether  or  not  a  certain  fertilizer  is  being  mixed  into  the  current 
product.  Their  job  will  be  to  set  the  switches,  upon  receiving  a  telephone  call  describing  the 
desired  mixture. 
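
The count of sixteen functions noted above follows because two switches have four joint settings, and a light may be on or off independently at each setting (2^4 = 16). The following is a minimal sketch of that enumeration, not part of the original study; the encoding of "up" as True and the names SETTINGS and all_two_switch_functions are illustrative assumptions.

```python
from itertools import product

# The four joint settings of (switch 1, switch 2).
# Assumption for illustration: True means "up", False means "down".
SETTINGS = list(product([False, True], repeat=2))


def all_two_switch_functions():
    """Return every logical function of two switches as a truth table.

    Each truth table is a tuple of four outputs, one per entry of SETTINGS.
    """
    return list(product([False, True], repeat=len(SETTINGS)))


tables = all_two_switch_functions()
print(len(SETTINGS))  # 4 switch settings
print(len(tables))    # 16 possible functions, of which the box realizes four
```

Any one light on the box corresponds to a single entry of `tables`.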

The  explanations  elicited  from  instructors  in  this  way  are  then  subjected  to  formal  linguistic 
analysis,  to  identify  significant  variables  and  to  formulate  interesting  hypotheses  about  the 
relations  between  the  form  of  the  instruction  and  subsequent  performance  by  students.  The 
most  suitable  of  these  hypotheses  will  be  tested  in  Phase  II  to  determine  which  instructional 
structures  have  the  most  favorable  effects  on  learners’  performance. 

This  task  was  chosen  because  it  can  be  presented  by  instructors  using  a  wide  variety  of  media 
mixtures  (e.g.,  spoken  language  and  written  language;  charts,  equations  and  diagrams  on  the 
blackboard;  and  direct  use  of  the  demonstration  device)  at  several  different  cognitive  levels 
(including  concrete  operational  and  abstract  Boolean  algebra;  see  Section  4.1).  Moreover,  it  is 
fairly easy to design and score instruments to test the effectiveness of a given instructional
technique using student comprehension as a dependent measure. Although the task that we
have chosen may seem relatively simple, in fact instructors exhibited surprising variability in
its performance; in addition, it is typical of subtasks of larger instructional tasks, and we
believe that the results obtained from its analysis will generalize to far more complex
situations.

Figure 1: The Logic Box

1.1.2  Procedures 

Subjects  were  instructors  from  the  Engineering  Department  of  a  local  community  college. 
Before  the  instructional  session  began,  one  of  the  experimenters  presented  the  logic  box 
instruction  task  to  the  instructor;  the  logic  box  itself  was  explained  to  instructors  using  a 
circuit  diagram,  with  the  remark  that  this  would  probably  not  be  an  appropriate  explanation 
for students. This left the instructor free to determine a more concrete level of description
for the students. In the first two experiments, the instructor had an audience of community
college  students  with  little  or  no  previous  exposure  to  engineering  subjects.  (This  population 
is  similar  in  background  and  age  to  most  novices  entering  technical  training  programs,  such  as 
in  the  Armed  Forces.)  The  experimenters  played  the  role  of  students  in  subsequent 
experiments,  since  the  students’  questions  proved  to  be  insufficiently  focussed  to  elicit  the 
desired  range  of  responses. 

Five  instructors  were  used  as  subjects  in  six  experiments.  This  series  of  experiments 
increasingly  refined  our  experimental  technique.  All  experiments  were  recorded  on  audio  tape 
and  then  transcribed,  yielding  a  total  of  124  pages  of  transcript  for  the  instructor  briefing  and 
subsequent  instructional  session.  (There  are  also  debriefings  for  students  and/or  instructors 
for  some  sessions.)  Each  such  session  lasted  between  one-half  and  one  hour.  The  first  two 
experiments  used  the  same  instructor,  and  also  used  groups  of  community  college  students  as 
an  audience,  4  students  in  the  first  and  5  in  the  second. 

1.1.3  Results 

Analysis  of  this  corpus  resulted  in  the  theory  given  in  Section  4,  the  variables  given  in  Section 
5,  and  the  hypotheses  given  in  Section  6.  In  addition,  we  discovered  that  our  theory  of  the 
command  and  control  speech  act  chain  [Structural  Semantics  83]  was  directly  applicable  to  the 
structure  of  hands-on  instruction;  see  Section  2.6  and  Appendix  II.  Finally,  we  became 
convinced  of  the  necessity  to  study  the  visual  component  (as  discussed  in  Section  1.2)  and 
were  inspired  to  begin  a  systematic  study  of  optimal  representations  based  on  semiotics  (see 
Section 3).

1.2  The  Visual  Component  of  Explanation 

Our  research  to  date,  and  indeed  most  research  on  explanation,  has  concentrated  on  the 
analysis  of  the  verbal  component,  since  this  appears  to  be  the  dominant  component, 
controlling  many  aspects  of  those  other  modalities  that  may  be  present.  However,  we  have 
now  found  that  it  is  necessary  to  analyze  the  visual  component  as  well  as  the  verbal.  This 
subsection  discusses  some  reasons  for  this,  and  some  probable  practical  results. 

1.2.1 Indications of the Necessity for Study of the Visual Modality

We  had  initially  decided  not  to  use  video  in  these  experiments,  since  it  is  well  known  that 
video  analysis  is  a  difficult  and  lengthy  process.  However,  our  pilot  experiments  have  made  it 
clear  that  some  of  the  most  theoretically  interesting  and  practically  important  issues,  such  as 
the  interaction  between  linguistic  and  visual  modalities  and  the  reasons  why  some  visual 
materials  are  more  effective  than  others,  can  only  be  studied  using  video-taped  data. 

1.  The  Ubiquity  of  Diagrams.  In  our  instructions  to  instructors,  we  indicated  that  if 
they  wished,  they  could  use  the  blackboard.  All  five  instructors  made  significant  use  of 
both  the  blackboard  and  the  demonstration  device,  and  also  employed  a  variety  of 
referring  expressions  in  oral  explanations,  such  as  it,  this  one,  that  one  and  the 
bottom  case  (referring  to  the  last  row  of  a  table).  It  is  difficult  or  impossible  to 
understand  the  meaning  of  such  expressions  without  video. 

2.  Visual  Deixis.  One  of  the  most  important  problems  in  linguistic  theory  is  reference 
and  the  accomplishment  of  reference.  This  problem  becomes  additionally  complicated 
when  reference  can  be  accomplished  not  only  with  referring  expressions,  but  also  by 
means of visual deixis (pointing, gaze direction, etc.). In a subject domain in which a
considerable  amount  of  the  material  to  be  conveyed  is  in  visual  form,  visual  deixis 
becomes  so  important  a  form  of  reference  that  it  must  be  studied  in  order  to  understand 
the  mechanisms  of  communication. 

3.  Anomalies  in  the  Visual  Modality.  Our  current  research  has  revealed  a  number  of 
interesting  anomalies  or  errors  in  the  construction  of  diagrams.  One  such  anomaly  is  the 
case  of  a  subject  who  made  visual  reference  to  parts  of  the  diagram  by  pointing  to  its 
parts,  before  he  drew  the  diagram  on  the  board.  (Our  experience  of  lectures,  classroom 
explanations, etc., indicates that this practice is more common than might be believed.)
Another interesting case is that in which the diagram differs from the verbal explanation,
either because the diagram is incorrect, or because the explanation is incorrect. It
seems clear that these unfortunately fairly common types of error must have a
considerable effect on learning. In addition to this fairly obvious hypothesis, it would
also  be  of  great  interest  to  see  whether  these  types  of  error  are  themselves  dependent  on 
some  other  variables  in  the  communication  situation. 

1.2.2  Practical  Reasons  to  Study  the  Visual  Modality 

In  addition  to  these  considerations  from  the  experiments  already  performed,  there  are  also 
strong  practical  reasons  for  studying  the  visual  modality.  The  first  is  that  the  relation 
between  the  verbal  material  and  the  visual  models  should  yield  a  number  of  hypotheses  which 
would be easy to test, and which would be rather general in scope. In addition, such
hypotheses  should  lead  to  valuable  suggestions  for  training  in  multimedia  instruction,  since  it 
is  known  that  most  people,  including  most  instructors,  receive  no  training  in  the  production 
and  use  of  effective  diagrams,  visual  models,  etc. 

1.2.3  Theoretical  Basis  of  the  Study  of  the  Visual  Modality 

Linguistics  provides  a  wealth  of  techniques  for  the  analysis  of  spoken  language,  but 
unfortunately  there  is  no  comparable  body  of  theory  for  the  analysis  of  graphical  or  mixed 
media data. We found that, while semiotics does provide a good starting point, it lacks a
theoretical framework for describing systematic ways of representing signs from one sign
system with signs from another system (e.g., representing states of the logic box by rows of a
table on a blackboard). This led us to develop the notion of a "semiotic morphism," described
in Section 3.4. Now that this theory is available, we are able to design experiments especially
suitable  for  collecting  video  data,  and  in  Phase  II,  we  will  perform  at  least  two  such 
experiments  before  moving  on  to  testing  hypotheses  about  which  representations  and 
structures  have  the  most  favorable  effects  on  learners’  performance.  The  experiments  of 
Phase  II  will  involve  showing  instructional  videotapes  constructed  according  to  specific 
principles  to  groups  of  experimental  subjects,  and  then  administering  a  single  post- 
instructional  instrument. 

1.3  Related  Research 

The  present  study  has  connections  with  many  other  projects  in  cognitive  science,  psychology, 
and  artificial  intelligence.  Part  of  our  analysis  is  to  understand  the  conceptual  structure  of  the 
task  and  the  task  domain.  This  is  similar  to  [Stevens  &  Steinberg  81,  Stevens  &  Collins 
80,  Gentner  &  Gentner  82],  which  provide  conceptual  models  for  complex  knowledge  domains 
and  the  possible  explanations  that  can  be  given  of  them.  Similarly,  [Kieras  82,  Kieras  & 
Bovair  83]  study  the  organization  of  knowledge  schemas  for  electronic  devices  and  the  effects 
of  different  mental  models  on  how  to  operate  such  devices.  An  early  important  study  in  this 
area is [Grosz 77], which established the hierarchical nature of people's knowledge of task
domain  structure  for  problems  like  water  pump  repair. 

The  present  study  differs  from  these  preceding  studies  in  emphasizing  structural  analysis  of 
both  the  semantics  and  the  syntax  of  interaction,  particularly  explanation.  We  find  that 
complex communicative situations involving multiple semiotic systems cannot be analysed
intuitively  or  on  the  basis  of  our  knowledge  as  members  of  the  culture;  thus  formal, 
theoretically-based  analysis  is  required. 

Many  additional  references  are  given  in  the  body  of  this  report  in  connection  with  specific 
topics  such  as  semiotics. 

2  Discourse  Analysis 

This  section  reviews  some  of  the  concepts,  theories,  and  techniques  from  linguistics  that  are 
used  to  analyze  the  explanations  elicited,  in  order  to  provide  specifiable  and  quantifiable  data 
for  further  research.  We  first  discuss  the  basic  notions  of  discourse  unit  and  discourse  type, 
and  the  kinds  of  rules  that  apply  to  them,  and  then  discuss  the  known  discourse  types  found 
in  our  data,  mentioning  some  particularities  that  these  discourse  types  exhibit  in  instructional 
discourse. Many of our examples are taken from the study of aircrew
communication [Structural Semantics 83], since the complete range of examples is not yet
available for instructional discourse at the present stage of research.

2.1  Discourse  Unit  and  Discourse  Type 

Discourse  analysis,  the  study  of  linguistic  units  larger  than  the  sentence,  is  used  in  this  study 
because  it  appears  that  the  discourse  unit,  rather  than  the  sentence,  the  word,  etc.,  is  the 
linguistic  level  of  greatest  significance  in  effective  instruction.  Specifically,  we  have  found  that 
reasoning,  pseudonarrative,  and  the  command  and  control  speech  act  chain  are  the  most 
relevant discourse types in our data. These are discussed in Sections 2.3, 2.4 and 2.5 below,
respectively.

A discourse unit is a segment of spoken language, composed of one or more sentences, having
socially  recognized  initial  and  final  boundaries,  and  a  formally  definable  internal  structure. 
(This  definition  generalizes  the  criteria  given  by  [Labov  72]  for  the  narrative  of  personal 
experience.)  Other  discourse  units  that  have  been  studied  include  pseudonarratives, 
specifically spatial descriptions [Linde 74, Linde & Labov 75], plans [Linde & Goguen 78],
jokes, and explanations [Weiner 79, Goguen, Weiner & Linde 81]. It is rare that an entire
discourse unit consists of a single sentence; more often, it appears as several sentences, a
question-answer pair, a question-answer-evaluation triple, etc. A discourse type is a theory
of the structure of a class of discourse units, that is, it provides a way of recognizing whether
or  not  a  given  segment  of  language  is  an  instance  of  the  type.  Thus,  we  can  think  of  a 
discourse  type  as  the  class  of  discourse  units  that  satisfy  a  given  theory.  This  corresponds  to 
the  familiar  distinction  between  type  and  token. 

Discourse  analysis  studies  the  structure  of  discourse  types.  In  order  to  apply  it  to  the  question 
of  how  people  actually  use  discourse  units,  there  are  a  number  of  further  requirements  on  how 
the  research  should  be  conducted,  and  in  particular,  on  the  descriptions  to  be  used  for  the 
discourse units involved. First, the work must be based upon a careful empirical analysis of
actual  human  discourse  in  natural  situations.  This  means  in  particular  that  we  cannot  use 
invented  examples  to  develop  our  theory  (although  such  examples  could  be  used  to  illustrate 
it). Second, it is necessary to have a mathematically precise description of the discourse
structures  of  interest.  Without  this,  we  cannot  properly  test  hypotheses  involving  variables 
that  refer  to  discourse  structure.  Third,  a  suitable  theory  must  also  provide  a  simple  and 
natural  taxonomy  of  the  parts  that  can  occur  in  a  given  type  of  discourse,  and  of  how  these 
parts  relate  to  one  another.  Each  of  the  discourse  types  that  has  been  studied  has  certain 
characteristic  parts,  and  also  certain  characteristic  relationships  of  subordination  among  these 
parts. 

For example, in reasoning, one statement may be subordinate to another statement by the
relationship  of  providing  a  supporting  REASON,  as  in  the  following  example,  where  the 
second  statement  supports  the  first. 

(1a) If your memory is short, the best thing to do is
     to construct a table

(1b) so that you know exactly what what the output would
     be.

Other  kinds  of  subordination  that  can  occur  in  reasoning  include  a  subordinate  statement 
serving  as  an  EXAMPLE  (i.e.,  an  instance)  of  a  statement,  and  several  statements  serving  in 
conjunction  or  disjunction,  supporting  the  same  statement.  There  is  also  ALT  subordination, 
indicating  that  two  subtrees  represent  alternate  possible  worlds. 

Such an organization of discourse units into parts that are connected by relationships of
subordination is easily and naturally represented by a tree structure. This offers a
convenient,  graphically  suggestive,  and  mathematically  precise  way  to  represent  hierarchical 
subordination.  In  this  representation,  the  top  node  represents  the  whole  discourse,  and  its 
immediate  subordinates  represent  the  first  subdivision  into  parts.  For  example,  in  reasoning 
the  top  node  is  a  STATEMENT/REASON  node  indicating  a  division  into  two  major  parts, 
the  first  a  statement  of  the  proposition  to  be  established,  and  the  second  a  structure  of 
propositions  supporting  this  statement.  Labels  on  nodes  distinguish  the  different  kinds  of 
subordination  that  occur;  these  labels  are  called  subordinators.  Such  a  labelled  tree 
structure  closely  resembles  the  tree  structure  of  a  mathematical  proof  of  the  assertion  at  the 
STATEMENT  subtree  of  the  root. 
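
The labelled tree representation described above can be sketched in a few lines of Python. This is only an illustration of the data structure, using Example (1a-b) as its content; the class name Node and its methods are hypothetical and do not come from the original theory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A discourse tree node.

    Interior nodes carry a subordinator label (e.g. STATEMENT/REASON);
    leaves carry the text of a discourse part.
    """
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

    def render(self, depth: int = 0) -> str:
        """Return an indented outline of the subtree rooted at this node."""
        lines = ["  " * depth + self.label]
        for child in self.children:
            lines.append(child.render(depth + 1))
        return "\n".join(lines)


# Example (1a-b): the second part supports the first.
s1 = Node("If your memory is short, the best thing to do is to construct a table")
s2 = Node("so that you know exactly what what the output would be")
tree = Node("STATEMENT/REASON", [s1, s2])
print(tree.render())
```

The root label names the kind of subordination, and the order of the children records which part is the statement and which is the supporting reason.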

A  fourth  feature  of  discourse  that  an  adequate  theory  should  model  is  the  construction  of 
discourse  units  in  real  time.  For  this  purpose,  it  is  necessary  to  have  a  notion  of  the  present 
focus  of  attention,  in  order  to  be  able  to  indicate  to  what  previous  part  a  new  part  is  to  be 
subordinated,  as  discussed  in  the  next  subsection. 

2.2  Transformation  and  Focus  of  Attention 

The  real  time  aspect  of  discourse  is  especially  important  for  any  study  of  the  interactive  use  of 
language.  The  process  of  discourse  construction  is  modelled  by  transformations  on  the  tree 
structure  that  represents  the  discourse  structure.  Such  transformations  can  add,  delete,  or 
alter  a  discourse  part.  It  is  particularly  important  within  the  instructional  context  to  have  a 
formal  description  of  such  processes,  since  they  represent  important  variables  of  instructor- 
student  interaction  and  of  instructor  self-correction. 

For example, Figure 2 shows the transformation that constructs a tree representing a text of
the form S1 since S2, as in Example (1a-b) above. It begins with S1, If your memory is
short, the best thing to do is to construct a table, which is then subordinated by
a STMT/RSN node as the transformation adds the statement S2, so that you know
exactly what what the output would be, supporting S1.

              STMT/RSN
   S1   ->     /    \
             S1      S2

Figure 2: A Simple Transformation

Transformations are very familiar in the literature of linguistics [Chomsky 65]. However, they
are most commonly applied to the structure of sentences, rather than to larger discourse
structures. Also, such transformations have not been used to model the real time construction
of syntactic structures, but rather have been postulated as part of an abstract mechanism for
generating syntactic structures.

The focus of a discourse represents the presumed focus of attention of the participants at a
given point in a discourse; it might be described intuitively as "where we are now."
Graphically, we represent the current focus as a pointer at a particular node on the tree.² [Grosz
77] discusses a notion of focus that is primarily semantic and is useful for resolving pronoun
references; it involves a hierarchical structure of "focus spaces" that is similar to the use of
embedded pointers in our theory.

There  is  one  very  important  connection  between  focus  and  transformations,  a  constraint  on 
how  discourse  structure  can  be  built  up  in  real  time:  a  transformation  can  be  applied  only  at 
the  node  currently  in  focus.  This  constraint  on  the  application  of  transformations  corresponds 
to  speaker's  and  hearer’s  expectations  about  what  will  occur  next.  In  particular,  a 
transformation  cannot  be  applied  to  a  part  of  the  tree  developed  earlier  without  first  moving 
the  pointer  back  to  the  appropriate  subtree.  Some  transformations,  in  fact,  only  accomplish 
pointer movement, i.e., they just change the focus of attention, and thus do not add any
semantic content to the tree.

² Actually, more than one pointer is needed for some transformations [Goguen, Weiner & Linde 81]. We have
found constructions in explanation much like those called "parallelism" in classical rhetoric, where there is not
only an active node of focus, but also a passive node; in these constructions, some transformations reverse the
active and passive nodes, so that addition can proceed alternately between two subtrees. Markers such as "on the
other hand" are used to switch to the other subtree. There are even cases where more than two pointers are
needed; for example, if one parallel construction is embedded within another. However, such constructions can
be quite difficult to understand, and we have not found them in instructional discourse.
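
The constraint that a transformation applies only at the node in focus can be illustrated with a small Python sketch. It assumes a single focus pointer (the multi-pointer case is ignored), and the names Node, Discourse, add_reason, and move_focus are hypothetical. Where the focus rests after a part is added is also an assumption made here for illustration, since the text does not specify it.

```python
class Node:
    """A discourse tree node with a link to its parent."""

    def __init__(self, label, children=None, parent=None):
        self.label = label
        self.children = children if children is not None else []
        self.parent = parent


class Discourse:
    """A discourse tree built up in real time, with one focus pointer."""

    def __init__(self, first_statement):
        self.root = Node(first_statement)
        self.focus = self.root

    def add_reason(self, text):
        """Subordinate the node in focus, plus a new reason, under STMT/RSN.

        The transformation is applied at the focused node only; no other
        part of the tree is touched.
        """
        old = self.focus
        parent = old.parent
        reason = Node(text)
        new = Node("STMT/RSN", [old, reason], parent)
        old.parent = new
        reason.parent = new
        if parent is None:
            self.root = new
        else:
            parent.children[parent.children.index(old)] = new
        # Assumption: focus moves to the newly added part.
        self.focus = reason

    def move_focus(self, node):
        """A pure pointer movement: changes focus, adds no content."""
        self.focus = node


d = Discourse("S1")
d.add_reason("S2")                 # builds STMT/RSN(S1, S2)
d.move_focus(d.root.children[0])   # return to S1 before elaborating it
d.add_reason("S3")                 # builds STMT/RSN(STMT/RSN(S1, S3), S2)
```

In this sketch, adding to an earlier part of the tree requires an explicit `move_focus` call first, which mirrors the requirement that the pointer be moved back before an earlier subtree is extended.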

This general theory of the structure of discourse types is the basis for the particular theories of
reasoning,  pseudonarrative,  and  command  and  control  speech  act  chain  present  in  our  data. 
We  now  turn  to  the  first  of  these. 

2.3  Reasoning 

Reasoning  (called  explanation  in  some  previous  studies)  is  the  most  frequent  discourse  type 
found  in  instructional  discourse,  and  has  been  studied  previously  using  accounts  of  income  tax 
decisions,  career  choice,  and  the  probable  effect  of  taking  certain  political  decisions  [Weiner 
79,  Goguen,  Weiner  &  Linde  81]  as  data.  In  the  current  study,  we  call  this  discourse  type 
reasoning  and  reserve  the  term  explanation  for  a  broader  social  function.  This  social 
function  of  explanation  can  be  accomplished  by  many  discourse  types  including  reasoning, 
narrative,  pseudonarrative,  and  planning.  For  example,  a  question  like  Why  are  we  learning 
this?  might  be  answered  with  a  pseudonarrative  about  the  mixtures  of  fertilizer,  or  with  a 
reasoning  structure  to  show  that  a  correct  approach  is  being  taken.  Either  of  these  could 
function  as  an  explanation. 

Figure 3 shows an analysis of a simple instance of reasoning from the domain of aircrew
communication, in which the flight engineer reports to ground control his justification of the
decision not to recycle the landing gear after an initial attempt to bring the landing gear down
has failed.

The  most  important  relationship  of  subordination  in  reasoning  is  indicated  by  the 
STATEMENT/REASON  node.  In  the  reasoning  structure  displayed  in  Figure  3,  the  main 
STATEMENT  is  Don’t  recycle  the  gear.  Everything  that  follows  is  a  REASON 
supporting  this.  The  ALT  node  represents  the  speaker's  postulation  of  two  alternate  worlds, 
differing  by  whether  or  not  the  landing  gear  is  broken.  This  ALT  node  is  established  by  the 
underlined portion of (2). (The number in parentheses refers to the time in the flight recorder
transcription of the United Airlines flight 173 crash near Portland, Oregon, on December 28,
1978; this convention is also used in subsequent examples taken from the same flight.)

(2) ...we're reluctant to recycle the gear for fear
    something is bent or broken. (1752:16)

The phrase for fear indicates both the uncertainty about whether the gear is bent,
and the decision to treat the alternate world in which it is bent as the one on which attention
is focussed; in fact, the world in which the gear is not bent or broken is only implicit in the
text of this example.

            STATEMENT/REASON
              /          \
   don't recycle          ALT
     the gear           /     \
              (not bent or    REASON/STATEMENT
                broken)          /        \
                               OR       not able
                              /  \      to get it
                           bent  broken   down

... we're reluctant to recycle the gear for fear something
is bent or broken, and we won't be able to get it down
(1761:16)

Figure 3: A Reasoning Tree

Figure  4  shows  the  node  types  found  in  reasoning,  including  EXAMPLE,  which  is  not  present 
in  the  example  of  Figure  3.  An  EXAMPLE  node  takes  as  its  subordinates  one  or  more 
examples  of  a  statement. 

STATEMENT/REASON        AND        OR        NOT
      /  \              /|\        /|\         |

STATEMENT/CHALLENGE     IF/THEN    EXAMPLE   ALT
      /  \               /  \       /|\      / \

Figure  4:  The  Subordinators  Found  in  Reasoning 

2.3.1  Summary  Nodes  in  Reasoning 

Although  we  found  that  the  theory  of  reasoning  structure  developed  in  [Goguen,  Weiner  & 
Linde  81]  applies  to  the  units  of  reasoning  in  the  current  dataset  of  instructional  discourse,  we 
also  made  one  addition  that  is  very  helpful.  This  is  a  new  branch  type,  called  SUMMARY, 
that subordinates capsule descriptions or summaries; the symbol Σ, Greek sigma, is used to
label these branches. Nodes that involve this new kind of branch include Σ/STATEMENT,
STATEMENT/Σ, Σ/STATEMENT/Σ, STATEMENT/REASON/Σ, and
Σ/REASON/STATEMENT. Some hypotheses concerning the placement and structure of
summary branches are given in Section 6. Example (3) is a typical instance of reasoning, and
contains several summaries. It is a response to a student's complaint that previously given
explanations were too simple.

(3) You can get a little more complicated. Like you
    can think about what the individual lights do.
    One of the interesting things with this one, it
    doesn't depend upon this switch at all. Like to
    see [inaudible] Switch Two. So if it's off when
    Switch Two is down, it's on when Switch Two is
    up, independent of what happens with Switch One.
    So, Switch Two controls light C. You know, let's
    see, what's another one? I think D says that the
    two switches are the same ... so if they're both
    down or both up, D is on, but, if they're
    different, D is off. I think A, if I remember
    right, A is, is th-, no B is that they're
    both down. Uh, in any other position B is off.
    And then there's a more complicated one. What
    was that? If A's, A says that it's not anything
    other than One up or Two down, then it's on.
    Then that's off. It gets a little bit more
    complicated when you explain it that way.

The reader unfamiliar with transcriptions of spoken language may find (3) incomprehensible. It is, however,
quite characteristic of spoken data. It is immediately comprehensible when heard on the tape; with practice, the
written version of such data becomes familiar and accessible.

The  structure  of  this  reasoning  unit  is  shown  in  Figure  5.  Notice  that  two  nodes  of  this 
structure  have  summary  branches;  the  top  (root)  node  actually  has  two  summary  branches, 
one  given  before  the  body  of  the  explanation  and  the  other  given  after  it.  The  body  of  this 
reasoning  unit  consists  of  an  AND  that  subordinates  four  reasoning  sub-trees,  one  for  each 
light on the box. A summary is given after the first of these sub-trees. It is interesting to
notice that the order in which the lights are considered is here not their physical order on the
box (which would be A, B, C, D, going left to right); rather, the lights are discussed in order of
increasing psychological complexity (although not based on any firm evidence, this order
appears to be C, D, B, A; using the convention that Sn is the predicate "switch n is up," C is
just "S2", D is "S1=S2", B is "(not S1) and (not S2)," and A is "S1 or not S2."). Two
semiotic  systems  that  are  involved  here  are  (1)  the  system  of  things  that  are  observable  about 
the  lights,  and  (2)  the  discourse  system  in  which  these  observations  are  explained.  This 
example  is  suggestive  for  Hypothesis  6  in  Section  6,  that  optimally  comprehensible 
explanations  do  not  preserve  relations  (such  as  ordering  by  complexity)  at  the  expense  of  basic 
constructors  (in  this  case,  the  physical  placement  of  lights).  Discourse  elements  are  ordered  by 
the  time  of  their  production;  and  the  lights  can  be  ordered  either  by  the  complexity  of  their 
logical  functions,  or  by  their  physical  placement.  The  explanation  would  have  been  more 
comprehensible  if  the  physical  ordering  had  been  used. 

Figure 5: Structure Tree for an Instance of Reasoning


2.4  Pseudonarrative 

In  addition  to  reasoning,  instructional  discourse  also  contains  a  number  of  instances  of 
pseudonarrative.  Pseudonarrative  is  a  discourse  type  having  some  but  not  all  of  the 
characteristics  of  spontaneous  oral  stories,  the  discourse  type  of  which  is  called  narrative. 
Like  narrative,  pseudonarrative  relies  on  the  narrative  presupposition,  the  rule  of 
interpretation  stating  that  the  order  of  main  clauses  is  to  be  taken  as  the  order  of  the  events 
that they describe. Also like narrative, the pseudonarrative type permits optional initial
summaries, closing evaluations, and end markers. The difference is that narrative consists of
past tense main clauses, referring to actions understood as actually having happened, whereas
pseudonarrative refers to hypothetical, potential, or habitual actions. (4) is a
pseudonarrative from the data of the current study. The reader may note a lack of
redundancy in this example; however, some redundancy is provided by the presence of a
diagram on the blackboard summarizing the same material.

(4) I'll go through it again slowly. O.K. In one
position of these two switches, where One is up and
Two is down, all these lights are off. We take
Switch One and move it to its other position, three
of the lights come on. We reverse this combination
and make them both go up, we got again three of the
lights are on but a different three lights. And,
if we move em till both down, again, we got three
lights on but these two lights are changed over.

Note the narrative structure imposed by the use of the personal pronoun we and the active
verbs go, take, move, make, indicating actions rather than states.

Pseudonarrative  has  previously  been  studied  in  the  domain  of  apartment  layout 
descriptions [Linde 74, Linde & Labov 75]. In this domain, speakers commonly use
pseudonarrative  structure  to  convert  spatial  organization  to  temporal  organization.  It  was 
found that they used a temporal organization far more frequently than a spatial organization,
and  made  far  fewer  errors  in  the  temporally  organized  descriptions,  suggesting  that  the 
pseudonarrative  organization  is  easier  to  produce  and  understand. 

In  the  present  domain,  pseudonarrative  offers  a  considerably  simpler  alternative  structure  to 
that  of  reasoning.  Structurally,  this  simplicity  is  reflected  in  the  fact  that  pseudonarrative  has 
a  broad  tree  structure  rather  than  a  deep  one;  i.e.,  it  has  fewer  complex  subtrees.  The  choice 
between  pseudonarrative  and  reasoning  yields  the  general  hypothesis  that  broad  trees  are 
more  comprehensible  than  deep  trees,  since  the  load  on  memory  is  less  [Yngve  60],  and  the 
specific  hypothesis  that  pseudonarrative  is  simpler  for  novices  than  reasoning  structure,  since 
the  discourse  organization  will  be  more  familiar. 
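
The contrast between broad and deep trees can be made concrete with a small sketch. The node labels and the two trees below are hypothetical illustrations, not structures taken from the data; the sketch only shows one way of measuring the depth and breadth of a discourse tree so that the two shapes can be compared.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def depth(node: Node) -> int:
    """Number of levels on the longest path from this node down to a leaf."""
    return 1 + max((depth(c) for c in node.children), default=0)


def breadth(node: Node) -> int:
    """Largest number of children found under any single node."""
    return max([len(node.children)] + [breadth(c) for c in node.children])


# Hypothetical pseudonarrative: one node subordinating a flat list of events.
pseudonarrative = Node("SEQUENCE", [Node(f"event {i}") for i in range(1, 5)])

# Hypothetical reasoning unit: nested subtrees under the root.
reasoning = Node("STATEMENT/REASON", [
    Node("statement"),
    Node("AND", [
        Node("STATEMENT/REASON", [Node("statement"), Node("reason")]),
        Node("reason"),
    ]),
])

print(depth(pseudonarrative), breadth(pseudonarrative))
print(depth(reasoning), breadth(reasoning))
```

Under these assumptions the pseudonarrative tree is shallower and wider than the reasoning tree, which is the structural difference that the general hypothesis refers to.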

2.5 Command and Control Speech Act Chains

The  command  and  control  speech  act  chain  provides  the  simplest  way  of  accomplishing  several 
important  forms  of  complex  social  action,  and  is  important  in  the  study  of  instruction 
discourse, since any sequence in which an instructor requests a student to do anything and
then  receives  a  response  from  the  student  constitutes  a  command  and  control  speech  act  chain. 
Such  sequences  are  thus  the  basic  discourse  type  for  hands-on  instruction. 

We define a speech act chain to be a maximal sequence of speech acts, each of which has the
same major propositional content. (This discussion relies on [Searle 69] in its use of the terms
"speech act" and "propositional content.") One specific form of speech act chain constitutes
the command and control speech act chain, which has been studied as the basic discourse type
for aircrew discourse [Structural Semantics 83]. Appendix II gives a technical discussion of
such  command  and  control  speech  act  chains,  including  the  categories  of  utterance,  the 
subordinators  that  are  used,  and  the  rules  that  govern  sequencing. 
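
The definition of a speech act chain as a maximal run of speech acts sharing the same major propositional content can be sketched as follows. This is only an illustration: the dialogue is invented, and the `content` labels are hypothetical stand-ins for a propositional content that in practice must be identified by analysis of the transcript.

```python
from itertools import groupby
from typing import List, NamedTuple


class SpeechAct(NamedTuple):
    speaker: str
    kind: str     # e.g. "request for action", "acknowledgement"
    content: str  # hypothetical label for the major propositional content


def chains(acts: List[SpeechAct]) -> List[List[SpeechAct]]:
    """Split a sequence of speech acts into maximal runs with the same content."""
    return [list(run) for _, run in groupby(acts, key=lambda a: a.content)]


# Invented dialogue, for illustration only.
acts = [
    SpeechAct("Instructor", "request for action", "move switch one"),
    SpeechAct("Student", "acknowledgement", "move switch one"),
    SpeechAct("Instructor", "request for information", "state of light C"),
    SpeechAct("Student", "statement", "state of light C"),
    SpeechAct("Instructor", "acknowledgement", "state of light C"),
]

for chain in chains(acts):
    print([f"{a.speaker}: {a.kind}" for a in chain])
```

Because `groupby` only merges adjacent items, each resulting list is maximal in the sense of the definition: it cannot be extended in either direction without changing the propositional content.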

Example  (5)  illustrates  the  use  of  the  command  and  control  speech  act  chain  in  an 
instructional  context. 

(5a) Instructor: And if you had a question, now you could
ask a question.

(5b) Student: Um-

(5c) Instructor: You could say what, what kinda controls
do I have. What can I do with
that box.

(5d) Student: How come when you've got both of the switches
off, you have some lights on?

(5e) Instructor: Both of the switches off?

(5f) Student: See you've got them off.

(5g) Instructor: Okay well that isn't necessarily off, that's
just down. It, maybe you really wouldn't
wanna say up and down rather than on and
off. Might be a better way of saying it.
Does that make sense? ...

(5h) Student: Umm-hmm. Yeah.


(GS, p. 10)


In  this  example,  (5a)  and  (5c)  are  requests  for  action,  (5d)  and  (5e)  are  requests  for 
information,  (5f)  is  a  challenge,  (5g)  is  a  statement  followed  by  a  request  for  information,  and 
(5h) is an acknowledgement (italics indicate emphasis).

The  study  of  speech  act  chains  in  an  instructional  context  is  of  general  interest  in 
understanding  classroom  discourse  [Griffin  &  Mehan  81,  Sinclair  &  Coulthard  75],  and  is  of 
particular  importance  in  the  understanding  of  hands-on  instruction. 

3  Semiotics 

This  section  presents  our  preliminary  investigations  into  semiotics.  The  present  project  is 
concerned  with  optimal  structures  for  multimedia  instruction;  semiotics  is  a  natural  theoretical 
framework  for  such  an  investigation,  since  it  attempts  a  general  theory  of  all  sign  systems, 
including  language,  diagram,  gesture,  etc.  As  this  work  is  inspired  by  our  experimental 
program  rather  than  by  philosophical  analysis,  it  has  a  preliminary  character,  and  we  expect 
that  there  will  be  reformulations  as  our  experimental  hypotheses  are  tested  and  refined. 

We begin with a short introduction to semiotics based on the thought of Charles Sanders
Peirce, followed by a short discussion of the relation between semiotics and linguistics that
includes a review of Saussure, a founding figure with Peirce in the study of semiotics. We
next formulate precise notions of 'semiotic system' and 'sign system.' The main concept
then follows, that of a 'semiotic morphism,' which is a translation from signs in one system to
signs (i.e., representations) in another. Finally, we consider what makes some translations
better  than  others. 

These  ideas  should  be  applicable  to  many  aspects  of  instruction  as  well  as  to  other  areas  of 
communication;  some  instances  include  generating  appropriate  explanations,  determining  good 
representations  (’icons’)  for  computer  graphics,  measuring  the  quality  of  analogies,  and 
choosing  good  names  for  files  in  a  directory.  There  is  a  very  large  literature  that  is  relevant  to 
problems like these; however, all studies that we know are restricted to a particular semiotic
system (such as natural language) and/or a particular semantic domain, or else lack the
precision of the theoretical model that we will present. Some recent research, however, is
fairly close to ours in spirit; in particular, the last two problems listed above have been studied
by [Gentner 83], [Gentner & Gentner 82] and [Carroll 82], respectively, who reach conclusions
compatible with ours; in particular, they emphasize the importance of structure as opposed to
content.  However,  their  theoretical  framework  is  less  rich  than  ours,  which  includes 
hierarchical  levels  and  constructor  functions,  as  well  as  objects  and  relations. 

3.1  Peirce 

This  subsection  briefly  reviews  some  ideas  from  Peirce’s  approach  to  semiotics  [Peirce  65], 
since  our  approach  is  partially  based  upon  his  ideas.  We  try  to  use  Peirce's  original 
terminology  and  definitions  since  his  work  is  often  superior  to  later  popularizations  and 
extensions.  On  the  other  hand,  his  exposition  is  difficult  and  his  work  has  not  yet  been 
thoroughly  assimilated  into  current  philosophical  thought;  furthermore,  the  insights  of  modern 
linguistics  and  our  research  on  multimedia  instruction  have  suggested  certain  additions  and 
reformulations  that  we  do  not  attempt  to  distinguish  from  Peirce’s  original  concepts. 

A  basic  concept  in  Peirce’s  semiotics  is  semiosis,  an  instance  of  signification,  which  is  a 
situation  involving  the  following  three  main  components: 

1.  a  sign,  "something  which  stands  to  somebody  for  something  in  some  respect  or 
capacity;" 

2.  an  object,  that  for  which  the  sign  stands;  and 

3.  an  interpretant,  which  is  another  sign,  raised  by  the  original  sign  in  the  mind  of  the 
interpreter,  which  is  "directly  applicable  to  self-control,"  i.e.,  to  its  pragmatic  use. 

Peirce’s original terminology, apparently based on the medieval trivium, was "pure
grammar,"  "logic  proper,"  and  "pure  rhetoric."  These  concerned  respectively:  necessary 
conditions  for  meaningfulness,  necessary  conditions  for  truth,  and  "the  laws  by  which  ...  one 
sign  gives  birth  to  another,  and  especially  one  thought  brings  forth  another"  [Peirce  65]. 

Peirce  calls  the  "logical  interpretant"  of  a  sign  (as  opposed  to  its  "emotional"  interpretant) 
the  meaning  of  the  sign.  This  notion  suggests  the  modern  concepts  of  "knowledge 
representation;"  similarly,  his  "pure  rhetoric"  strikingly  resembles  the  concerns  of  modern 
"knowledge engineering" and expert systems. The following quotation might almost be a
modern  computer  scientist  (rather  than  a  "pragmaticist"4)  discussing  requirements  for  the 
knowledge  representation  system  of  a  robot: 


4Peirce introduced the term 'pragmaticism' in the hope that it would be so awkward that no one would copy
it, as William James had his earlier term 'pragmatism.'

The rational meaning of every proposition [which is a well-formed complex of
signs]  lies  in  the  future.  How  so?  The  meaning  [i.e.,  the  logical  interpretant]  of  a 
proposition  is  itself  a  proposition.  Indeed,  it  is  a  translation  of  it.  But  of  the 
myriads  of  forms  into  which  a  proposition  may  be  translated,  which  is  that  one 
which  is  to  be  called  its  very  meaning?  It  is,  according  to  the  pragmaticist,  that 
form which is most directly applicable to self-control under every situation and to
every  purpose. 

Certainly  Peirce  did  not  have  in  mind  the  fruits  of  modern  cognitive  and  computer  science, 
such  as  semantic  networks,  relational  databases,  non-monotonic  logic  and  rule-based  systems. 
But  these  appear  to  be  consistent  extensions  of  Peirce's  thought  in  the  direction  of  further 
precision,  applicability  and  effective  computability;  in  any  case,  we  shall  ourselves  move  in 
this  direction.  We  note  that  it  is  the  goal-directed  content  or  application  of  such  structures 
that  is  particularly  relevant  here. 

Some further comments on the notions of sign and object are in order. Peirce's "objects" are
not  limited  to  physical  objects,  but  also  include  relations  and  properties  as  possible 
designations  for  signs.  In  considering  computer  generated  instruction,  we  shall  probably  also 
want  to  use  less  traditional  and  more  complex  entities  from  the  ontology  of  modern  computer 
science, such as "continuations" and procedures or algorithms.

Peirce  had  a  good  deal  to  say  about  the  nature  of  signs,  much  of  it  very  relevant  to  our  study 
of  multimedia  instruction.  Let  us  first  consider  his  influential  threefold  division  of  signs  into 
icons, symbols and indices, according to the manner in which they signify. Peirce defines an
icon as a "sign which refers to the Object that it denotes merely by virtue of characters of its
own," "such as lead-pencil streak as representing a geometrical line." A sign x is an index
for an object y if x and y are regularly correlated, in the sense "that always or usually when
there is an x, there is also a y in some more or less exactly specifiable spatiotemporal relation
to the x in question" [Alston 67]. "Such, for instance, is a piece of mould with a bullet-hole in
it as sign of a shot" [Peirce 65]. Finally, Peirce defines a symbol as a "sign which is
constituted a sign merely or mainly by the fact that it is used and understood as such."

Peirce  did  not  finally  believe  that  the  best  interpretant  of  a  given  sign  is  necessarily  another 
sign. The argument goes as follows: Any given sign can always be "further developed" into
another sign that is its interpretant. This leads to an infinite sequence of signs. If at some
point the next interpretant in this sequence is the same as the previous one, then we may
regard the sequence as finite. But in some cases it is genuinely infinite, and there is no final
interpretant. Instead, Peirce (sometimes) took what he called habit, the "readiness to act in
a certain way under some circumstances and when actuated by a given motive" as "the
veritable and final logical interpretant." Since this is not a sign, it need not be further
interpreted.  Notice  that  this  formulation  is  also  quite  consistent  with  modern  procedural 
approaches  to  knowledge  representation. 

3.2  Semiotics  and  Linguistics 

Although  semiotics  is  intended  to  be  the  general  study  of  all  sign  systems,  almost  all  studies  of 
semiotics  have  taken  language  as  the  primary  semiotic  system.  There  are  several  reasons  for 
this.  First,  the  units  of  language  are  both  familiar  and  easy  to  discern.  Letters,  words,  and 
parts  of  speech  have  been  known  and  analyzed  as  formal  units  at  least  since  Aristotle,  and  the 
additional  units  added  by  modern  linguistics  are  relatively  well  agreed  upon.  In  contrast, 
kinesics,  or  body  language,  after  more  than  forty  years  of  study,  still  shows  no  agreement  on 
what  its  units  are,  or  whether  or  not  the  system  is  independent  of  parallel  activity  in  the 
linguistic  system.  Second,  of  known  semiotic  systems,  language  appears  to  have  the  greatest 
number  of  hierarchical  levels,  with  the  greatest  number  of  units  instantiating  each  level,  and 
also  the  fullest  body  of  theory  and  of  methods  for  studying  the  chosen  domain.  Therefore, 
many  studies  of  other  semiotic  systems  and  many  theoretical  formulations  of  semiotics  as  a 
meta-theory  take  linguistics  as  a  model.  For  example,  see  [Worth  &  Adair  73]  and  [Carroll  80] 
studying  the  semiotic  system  of  film,  and  [Barthes  57]  studying  cultural  systems,  such  as  the 
meaning  of  steak  in  French  cuisine. 

3.2.1  Saussure  and  Peirce 

Most  work  in  semiotics  derives  from  either  Saussure  or  Peirce.  A  major  difference  in  their 
approach  [Eco  79]  is  that  Saussure  gives  a  two  category  account  of  semiosis,  involving  only 
signs  and  meanings,  while  Peirce  gives  the  three  category  account  discussed  above.  Their 
work also differs greatly in emphasis: as a linguist with a background in historical linguistics,
Saussure focusses on the paradigmatic relations of signs within sign systems; while Peirce, as a
logician  and  philosopher  interested  in  pragmatics,  considers  signification,  interpretation,  and 
the  combination  of  signs  in  syntagmatic  relations.  A  paradigmatic  organization  of  signs  is 
composed of all the signs of a given class in relation to one another. Paradigmatic sign
systems include, for example, the paradigm consisting of all case markings possible for a noun
of  a  given  class,  the  paradigm  of  all  relative  pronouns,  and  the  paradigm  of  all  Boolean 
functions.  In  contrast,  the  syntagmatic  plane  of  organization  consists  of  the  actual  order  of 
signs  as  they  are  used  in  real  time,  for  example,  the  order  and  syntactic  organization  of  words 
in  a  sentence,  or  of  symbols  in  a  logical  proof.  Clearly,  both  planes  of  organization  are 
necessary  to  understand  a  sign  system. 

Perhaps the most characteristic of Saussure's contributions to the study of semiotics is an
emphasis on and extension of the notion of the arbitrariness of the sign [Saussure 74].
Traditionally in linguistics, arbitrariness has meant the absence of a necessary connection
between sound and meaning. That is, the sound sequences 'arrow' or 'fleche' or 'pfeil' can all
express the same meaning; there is no necessary connection between the sound and the
meaning. However, Saussure's work shows that although the sound/meaning connection is
arbitrary (with the minor exception of onomatopoeia), the range of meaning of a given sign is
not  arbitrary,  but  rather  is  constrained  by  the  meanings  of  all  related  words.  For  example, 
the  set  of  color  terms  in  a  given  language  forms  a  mutually  constraining  semiotic  system 
within  the  lexicon.  Therefore,  it  is  not  possible  to  give  the  meaning  of  an  individual  color 
term  in  isolation.  Thus  we  cannot  understand  the  full  range  of  the  English  term  'brown'  from 
just a definition and examples of brown. We must also know what it is not, and so must also
understand  red,  black,  yellow,  and  all  the  other  terms  in  the  same  system  with  which  brown 
contrasts. 

These  notions  of  the  arbitrariness  and  mutual  constraint  of  paradigm  members  will  become 
important  when  we  study  particular  multi-media  semiotic  systems,  since  they  can  help  to 
answer  questions  about  the  boundaries  of  a  given  system,  their  degree  of  arbitrariness,  etc. 

3.3  Semiotic  Systems 

This  subsection  gives  our  notion  of  semiotic  system,  inspired  by  Peirce's  formulation  of 
semiosis;  it  will  be  seen  that  most  of  our  discussion  concerns  the  structure  of  complex  signs. 
Our  exposition  is  gradual,  and  is  finally  summarized  by  a  reasonably  precise  definition  of 
semiotic  system.  Some  formalization  of  the  structure  of  a  system  of  related  signs  is  needed  in 
order  to  study  what  makes  one  representation  of  a  given  sign  better  than  another.  This  is 
because  it  is  necessary  to  consider  how  related  (but  significantly  different)  signs  will  be 
represented in order to avoid confusion and ambiguity; and it is necessary to consider what
attributes  of  signs  should  be  given  priority  in  constructing  their  representations.  (These  issues 
of  representation  are  not  explicitly  addressed  until  Section  3.4.) 

In  all  but  the  simplest  sign  systems,  individual  signs  are  organized  into  compound  signs;  for 
example,  sentences  are  made  of  words.  This  is  a  fundamental  strategy  for  rendering  the 
complexity  of  non-trivial  communication  more  manageable.  One  may  iterate  this  strategy,  by 
regarding  complex  signs  at  one  level  of  analysis  as  individual  signs  at  a  higher  level,  and  then 
forming  compound  signs  from  these  as  well,  which  leads  to  a  multilevel  hierarchy  of  sign 
structure.  For  example,  linguistics  recognizes  the  following  levels:  phonological  (the  sounds  of 
a  given  language);  morphological  (the  smallest  repeated  compounds  of  phonemes  with  a  stable 
meaning);  lexical  (words);  syntactic  (phrases  and  sentences);  and  discourse  (multisentential) 
units. It is important to note that this is a "whole/part" hierarchy, in which items at each
level  are  composed  from  components  from  the  next  lower  level.  Such  a  hierarchy  is  therefore 
quite  different  from  Peirce’s  three-fold  division  of  semiosis,  which  focusses  on  the  meaning  of 
signs  at  a  given  level. 

Sometimes  there  may  be  one  or  more  basic  levels,  which  are  somehow  most  important  or 
characteristic of a given semiotic system. In the case of natural language, for the last twenty-five
years, it has been supposed that the sentential level is basic, although more recent
research  is  inclined  to  regard  the  discourse  level  as  at  least  as  important.  More  generally, 
there  may  be  a  partial  ordering  relation  upon  the  levels  of  a  semiotic  system,  such  that  some 
levels  are  more  basic  than  others. 

The  whole/part  hierarchical  organization  of  complex  signs  requires  that  signs  be  considered 
not  only  individually  but  in  their  context.  The  immediate  context  of  a  sign  consists  of  those 
other  signs  that  surround  it,  in  space  and/or  time,  and  that  together  with  it  form  a  complex 
sign  at  the  next  higher  level.  In  numerous  linguistic  studies,  it  has  been  found  that  the 
context  and  speaker  of  a  given  sentence  in  a  story  are  at  least  as  important  for  determining  its 
meaning as are the words that comprise it. (For an extreme example, the sentence "Yes"
could mean almost anything if given an appropriate context.) Generalizing this, we may say
that it is more useful to view meaning as being produced "top-down" than "bottom-up." (An
example of the utility of this view is found in artificial intelligence research, where contextual
cues have been found to be essential in recognizing and disambiguating signs; this has
particularly been true for speech understanding and machine vision projects.)

It  is  a  common  approach  to  take  individual  signs  as  the  basic  meaning  bearing  units;  however, 
Peirce  gave  that  role  to  propositions,  which  are  well-formed  complexes  of  individual  signs5. 
This  has  the  important  advantage  that  the  well-known  dependence  of  sign  meaning  upon  the 
context  in  which  the  sign  occurs  is  not  a  strange  phenomenon  that  needs  to  be  explained,  but 
instead  follows  directly  from  the  way  that  things  are  defined,  since  meaning  lies  in  the  context 
rather  than  in  the  individual  sign.  (Notice  that  this  phenomenon  can  be  iterated  over  several 
hierarchical  levels  of  sign  structure.) 

An  important  aspect  of  semiosis  is  the  particular  sensory  means  by  which  a  sign  is  expressed. 
Possible  senses  include  visual,  auditory,  and  kinesic.  For  convenience,  we  will  also  include 
mental  events  as  sensory  events;  some  such  move  is  clearly  needed  to  handle  many  important 
examples,  such  as  inferring  a  proposition  from  one  or  more  others.  Of  course,  a  sign  may 
involve  more  than  one  sensory  modality.  For  example,  a  telephone  conversation  involves  the 
auditory  modality,  while  a  television  program  involves  an  audio-visual  mix. 

For  a  given  sensory  mix,  a  very  large  number  of  different  signs  may  be  possible;  and  it  may 
also  be  possible  to  organize  these  signs  (or  subsets  of  them)  in  a  wide  variety  of  different  ways. 
Within  a  given  sensory  mix,  a  given  choice  of  signs  and  way  of  organizing  them  may 
characterize  a  particular  semiotic  system  (there  are,  of  course,  other  factors,  such  as  the 
objects  and  interpretants  involved).  Notice  that  a  sign  that  is  meaningful  in  one  semiotic 
system  may  not  be  in  another.  For  example,  different  alphabets  (such  as  Roman,  Greek  and 
Cyrillic)  involve  different  letters,  although  a  given  form  for  a  letter  may  be  used  in  more  than 
one  alphabet. 

3.3.1  A  Formalization  of  Semiotic  Systems 

Having  considered  the  three  aspects  of  semiosis,  and  the  whole/part  hierarchy  of  levels,  we 
now  consider  the  structure  of  entities  at  a  given  level.  Some  of  this  discussion  may  be  familiar 
from  formalisms  used  in  linguistics,  but  our  purpose  here  is  to  give  a  formalism  that  is 
applicable  to  any  semiotic  system  whatever. 

5Of  course,  a  proposition  is  also  a  sign;  and  in  some  cases,  an  individual  sign  is  also  a  proposition. 

We have already noted that entities at level n are constructed from entities at level n-1 (and
other entities at level n that are already so constructed). A given semiotic system admits only
a certain limited number of ways to put parts together at a given level n; we will refer to
these as its constructors at level n. In general, there is a classification scheme for the entities
at a given level (e.g., the parts of speech are such a scheme for the syntactic level of a natural
language semiotic system), and the constructors at that level can be seen as rules for
combining entities of these various classes to get a new entity of another certain class. Such
rules may be written in the form

r: <c1>...<cn> -> <c> ,

where <c> is the result class, r is the constructor, and <c1>,...,<cn> are the classes of the
parts that r puts together. Thus, e=r(e1,...,en) is the entity (of class <c>) resulting from
applying r to entities e1,...,en of classes <c1>,...,<cn>, respectively. (Incidentally, the
<ci> may be classes of entities either from level n or level n-1.)

A familiar special case is that of a formal context-free grammar, where each <ci> is a
syntactic class (or "part of speech"), the entities are words, and each rule r is of the form

e = w0 e1 w1 e2 w2 ... en wn ,

where w0,...,wn are fixed words (or strings of words, possibly the empty string); this is more
conventionally written in the form

<c> -> w0 <c1> w1 <c2> w2 ... <cn> wn .

A familiar special case of such a rule is

S -> NP VP

for which <c>=S, n=2, w0=w1=w2= the empty string, <c1>=NP and <c2>=VP.
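
The reading of a context-free rule as a constructor function can be sketched in a few lines. This is a minimal illustration under assumptions of our own: the class names `Entity` and `Constructor`, the field names, and the sample phrases are hypothetical and are not part of the formalism itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    cls: str   # classification of the entity, e.g. "NP"
    text: str  # the string that realizes the entity


@dataclass(frozen=True)
class Constructor:
    name: str
    arg_classes: Tuple[str, ...]  # <c1>,...,<cn>
    result_class: str             # <c>
    words: Tuple[str, ...]        # fixed strings w0,...,wn (n+1 of them)

    def __call__(self, *parts: Entity) -> Entity:
        # The rule applies only to parts of the declared classes, in order.
        if tuple(p.cls for p in parts) != self.arg_classes:
            raise TypeError(f"{self.name} expects classes {self.arg_classes}")
        # Interleave: w0 e1 w1 e2 w2 ... en wn
        pieces = [self.words[0]]
        for part, w in zip(parts, self.words[1:]):
            pieces += [part.text, w]
        return Entity(self.result_class, " ".join(p for p in pieces if p))


# S -> NP VP, with w0 = w1 = w2 = the empty string.
sentence_rule = Constructor("r", ("NP", "VP"), "S", ("", "", ""))

noun_phrase = Entity("NP", "the light")
verb_phrase = Entity("VP", "is on")
sentence = sentence_rule(noun_phrase, verb_phrase)
print(sentence)
```

The class check plays the role of the declaration r: <c1>...<cn> -> <c>; applying the rule to parts of the wrong classes is rejected rather than producing an ill-formed entity.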

However,  a  grammatical  formalism  that  is  based  on  strings  cannot  be  used  conveniently  for 
applications  like  two-dimensional  graphical  displays,  or  multi-media  presentations  such  as 
audio-visual animations. This is because string formalisms are not only inherently
one-dimensional6 but are also inherently limited to discrete phenomena, as opposed to phenomena
that are more naturally viewed as involving continuous variables. Some examples of
continuous variables would be pitch and volume in an auditory semiotic system, or size and
placement in a graphics system. This is why we have chosen the more general functional
notation e=r(e1,...,en).

6Although two or more dimensions can in principle be encoded in such a formalism, it is unnatural and
inconvenient to do so, since no special formal support is provided for this.

Three slight additions to this basic formalism seem useful; there may well be others that we
have not discovered. First, a given constructor (or rule) r may have, in addition to its formal
arguments e1,...,en which are entities, some number of parameters p1,...,pk, chosen from fixed
sets of parameter values <p1>,...,<pk>. Thus, we might write

r: <p1>,...,<pk>,<c1>,...,<cn> -> <c>

and

e = r_{p1,...,pk}(e1,...,en).

For an example of parameters, consider the location of the upper-lefthand corner, and the size
of  a  graphic  entity,  say  a  cat,  to  be  displayed  on  a  graphics  terminal;  depending  on  the  values 
of  these  parameters,  the  cat  will  have  a  different  location  and  size,  but  will  still  be 
recognizably  the  same  cat. 
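
A parameterized constructor can be sketched as a function that first fixes its parameters and then accepts its entity arguments. The names `Graphic` and `place`, and the coordinate and size values, are hypothetical and serve only to illustrate the cat example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Graphic:
    shape: str               # what is drawn, e.g. "cat"
    corner: Tuple[int, int]  # location of the upper-lefthand corner
    size: int                # size in arbitrary display units


def place(corner: Tuple[int, int], size: int) -> Callable[[str], Graphic]:
    """Fix the parameters (corner, size) and return the resulting constructor."""
    def r(shape: str) -> Graphic:
        return Graphic(shape, corner, size)
    return r


small_cat = place((0, 0), 10)("cat")
large_cat = place((40, 20), 50)("cat")
print(small_cat)
print(large_cat)
```

The two results differ in location and size but share the same `shape`, which corresponds to the observation that the displayed entity remains recognizably the same cat.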

The  second  addition  is  that  there  may  be  a  priority  ordering  on  these  constructor  functions. 
Under  such  an  ordering,  there  may  be  a  primary  constructor,  which  has  greater  priority 
than  any  other  constructor;  there  may  also  be  one  or  more  secondary  constructors,  each 
having  less  priority  than  the  primary  constructor  and  greater  priority  than  any  non-primary 
and  non-secondary  constructor,  with  none  of  these  having  priority  over  any  other;  similarly, 
there  may  be  one  or  more  tertiary  constructors,  etc.  Notice  that  what  we  have  here  is  a 
partial ordering rather than a total ordering, since given two distinct constructors r1 and r2,
it  is  not  necessary  that  either  one  has  priority  over  the  other. 
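
One simple way to realize such an ordering, offered here only as a sketch, is to assign each constructor a rank (1 for primary, 2 for secondary, and so on) and to compare ranks. The constructor names and the `RANK` table below are hypothetical.

```python
# Hypothetical constructor names; rank 1 is primary, 2 secondary, 3 tertiary.
RANK = {"r_primary": 1, "r_sec_a": 2, "r_sec_b": 2, "r_tertiary": 3}

def has_priority(r1: str, r2: str) -> bool:
    """True if r1 has priority over r2 (a strictly smaller rank number)."""
    return RANK[r1] < RANK[r2]

def comparable(r1: str, r2: str) -> bool:
    """True if one of the two constructors has priority over the other."""
    return has_priority(r1, r2) or has_priority(r2, r1)

# The two secondary constructors are distinct but incomparable,
# which is what makes the ordering partial rather than total.
secondaries_comparable = comparable("r_sec_a", "r_sec_b")
```

Ranking by integers yields exactly the layered shape described above; a more general partial ordering would need an explicit set of ordered pairs.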

The  level  of  discourse  types  in  the  English  natural  language  semiotic  system  provides  some 
nice  examples  of  primary  constructors.  For  example,  explanation  [Goguen,  Weiner  &  Linde 
81]  has  a  primary  constructor,  AND,  which  serves  to  conjoin  a  number  of  reasons  for  the  same 
statement.  The  argument  that  AND  is  a  primary  constructor  is  simply  that  it  is  so  basic  to 
the  explanation  discourse  type  that  explicit  textual  markers  for  it  can  often  be  omitted 
without  obscuring  the  meaning.  Several  other  discourse  types  are  also  known  to  have  such 
"default" constructors [Linde & Goguen 80]; thus, these also have primary constructors.


The third addition is that of context conditions⁷ on rules. These are conditions (i.e.,
predicates) that limit the applicability of a rule to certain particular contexts, namely those
where the predicate is true; these predicates may involve the arguments e1,...,en and also the
parameters p1,...,pk. Context conditions often express constraints that arise as a result of
structure at higher levels of the hierarchy, and are a feature of many recent grammatical
formalisms [Kaplan & Bresnan 82], [Wasow et al. 82].

The  predicates  that  can  be  used  for  expressing  constraints  are  also  a  significant  part  of  a 
semiotic  system.  At  a  given  level,  there  will  be  only  a  finite  number  of  basic  predicates;  others 
can  be  formed  as  simple  logical  combinations  of  these.  (Here  we  will  rely  on  the  conventions 
of  ordinary  logic,  instead  of  creating  a  special  grammar  of  predicates  for  a  given  semiotic 
system;  but  note  that  higher  order  logic,  and  other  extensions  of  first  order  logic,  may  be 
needed.)  For  example,  in  a  semiotic  system  for  graphics,  we  may  have  predicates  like  RED 
and SQUARE. Note that predicates can also be constructed using functions like COLOR,
SIZE  and  BRIGHTNESS.  These  associated  predicates  and  functions  express  basic  properties 
of  entities  at  a  given  level  of  the  semiotic  hierarchy.  Like  constructor  functions,  their 
arguments  may  be  restricted  to  particular  classes  of  entities  at  their  level.  We  will  use  the 
notation 

p: <c1>...<cn>

to indicate that p is a predicate having n arguments, where the ith argument must lie in class
<ci>. p(e1,...,en) is thus either true, false, or undefined, and is well-formed only if ei is of
class <ci> for i=1,...,n. Functions may also have their arguments restricted to particular
classes  in  the  same  way. 
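
The following sketch shows one way a class-restricted, three-valued predicate might be encoded. The `Predicate` class, the `(class name, attributes)` encoding of entities, and the use of `None` for "undefined" are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

Entity = Tuple[str, dict]  # (class name, attribute dictionary); a hypothetical encoding

class Predicate:
    """A predicate p: <c1>...<cn> whose arguments are restricted by class."""

    def __init__(self, name: str, arg_classes: Sequence[str],
                 test: Callable[..., Optional[bool]]):
        self.name = name
        self.arg_classes = list(arg_classes)
        self.test = test

    def __call__(self, *entities: Entity) -> Optional[bool]:
        classes = [cls for cls, _ in entities]
        if classes != self.arg_classes:
            # Not well-formed: some ei is not of class <ci>.
            raise TypeError(f"{self.name} expects {self.arg_classes}, got {classes}")
        return self.test(*entities)  # True, False, or None for "undefined"

def _is_red(figure: Entity) -> Optional[bool]:
    color = figure[1].get("color")
    return None if color is None else color == "red"

RED = Predicate("RED", ["figure"], _is_red)
```

Here `RED` applied to a figure with no recorded color is undefined, while applying it to an entity of another class is rejected as ill-formed.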

It  should  be  noted  that  there  is  a  standard  way  to  reduce  functions  to  relations.  For  example, 
we may represent a real-valued function f(e1,e2) as a relation

F: <c1> <c2> <real>

where <c1>, <c2> refer to the classes of e1, e2 respectively. Then f(e1,e2)=r if and only if
F(e1,e2,r)=true. We can also consider arguments and/or values that are not necessarily
entities; non-entity arguments correspond to parameters, and may be restricted to specific
parameter sets. For example, g(p1,p2,e1,e2) corresponds to the relation

G: <p1> <p2> <c1> <c2> <real>

where <p1>, <p2> refer to the parameter sets of p1, p2, and <c1>, <c2> refer to the
classes of e1, e2 respectively. Then g(p1,p2,e1,e2)=r if and only if G(p1,p2,e1,e2,r)=true.
Also note that higher order relations can play a significant role in some cases; in fact, the
priority ordering on constructors is such a higher order relation.

⁷ This is a limited and technical sense of "context." A broader sense has been given earlier; a still broader
sense would take account of pragmatics.

The  systematic  construction  of  new  entities  from  parts,  either  at  the  same  level  or  at  the  next 
lower  level,  provides  another  kind  of  hierarchical  structure,  that  of  entities  at  a  given  level. 
From  another  perspective,  one  may  speak  of  analyzing  a  given  entity  in  terms  of  other  entities 
at  the  same  or  lower  levels.  A  familiar  example  from  high  school  English  is  diagramming  the 
syntactic  structure  of  a  sentence.  Such  a  diagram  shows  the  division  of  a  sentence  into  parts, 
called "phrases," and moreover explicitly shows the relationships of subordination among
these phrases; that is, it shows which phrases are sub-phrases of other phrases, and also what
class they belong to. (Examples of classes are "noun," "verb phrase," "prepositional phrase,"
etc.) 

There  are  many  different  systems  of  notation  for  the  analysis  of  sentence  structure  and  for 
similar  structured  entities  in  other  semiotic  systems,  most  of  which  rely  upon  special 
conventions  suited  to  the  case  at  hand.  Our  intention  here  is  to  give  a  uniform  tree  notation 
that  applies  to  any  semiotic  system  having  constructor  functions  as  described  above.  If 
e=r(e1,...,en), then we shall represent e as a tree with root node labelled r, where r has n
branches corresponding to e1,...,en, each of which is either a subtree (constructed in the same
way recursively) or else is an entity of the next lower level. In some cases, it may also be
helpful to label edges with the classes to which they correspond, e.g., the edge
from the root to ei might be labelled <ci>. For example, the sentence "The light on the left
comes on" can be diagrammed as in Figure 6. Such diagrams can also be decorated with
parameters as subscripts to node names (or in parentheses after node names). We note again
that the purpose of such diagrams is to show the internal part/subpart structure of a given
entity at a given level within a given semiotic system. It is also worth noting that such an
abstract "parse tree" of a sign gives an ordering to the entities which comprise it, namely the
left-to-right ordering of the "frontier" (i.e., the leaf nodes) of the tree. We call this ordering
the intrinsic ordering of these parts.


Figure 6: Part/Subpart Hierarchy for a Compound Sign
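
The tree notation and the intrinsic ordering can be sketched in a few lines. The `Node` type and the `frontier` function are hypothetical names, and the particular analysis of the example sentence (with node labels Det, N, NP, Prep, PP, V, Part, VP) is one plausible parse assumed for illustration, not a transcription of Figure 6.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Node:
    """A tree node with a label; children are subtrees or lower-level entities."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)

def frontier(tree: Union[Node, str]) -> List[str]:
    """Return the leaf entities in left-to-right order (the intrinsic ordering)."""
    if isinstance(tree, str):
        return [tree]
    leaves: List[str] = []
    for child in tree.children:
        leaves.extend(frontier(child))
    return leaves

sentence = Node("Sentence", [
    Node("NP", [
        Node("NP", [Node("Det", ["The"]), Node("N", ["light"])]),
        Node("PP", [Node("Prep", ["on"]),
                    Node("NP", [Node("Det", ["the"]), Node("N", ["left"])])]),
    ]),
    Node("VP", [Node("V", ["comes"]), Node("Part", ["on"])]),
])

words = frontier(sentence)
```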

Many  aspects  of  this  approach  to  the  structure  of  signs  seem  to  be  present  or  implicit  in 
Peirce’s  treatment  of  the  semiotics  of  propositions.  But,  as  far  as  we  can  tell,  these 
considerations  were  never  explicitly  assembled  into  a  single  definition.  Our  purpose  in  doing 
so  below  is  to  make  as  precise  and  explicit  as  possible  what  is  involved  in  constructing  (or  in 
attempting  to  construct)  optimal  explanations  or  other  representations  of  instructional 
material. (This application is considered in more detail in Section 3.4.)

The  basic  insight  underlying  this  definition  of  semiotic  system  is  that  semiotic  events  are  not 
isolated  phenomena,  but  rather  occur  in  systems:  there  are  common  rules  relating  to  the 
recognition,  construction,  denotation  and  interpretation  of  such  a  collection  of  signs. 
Moreover,  semiotics  as  a  subject  is  (or  should  be)  more  concerned  with  such  rules  than  with 
the  comparative  study  of  individual  signs  and  the  settings  in  which  they  are  found.  (This 
distinction is like the distinction between descriptive biology and modern biology that is based
on  biochemistry  and  molecular  biology.)  We  repeat  that  Definition  1  merely  embodies  our 
current  understanding  of  the  structural  elements  that  are  involved  in  semiotics  and  can  be 
expected  to  change  as  that  understanding  improves. 

Definition  1:  A  semiotic  system  consists  of  four  classes  of  entities: 

1.  Signs, 

2.  Objects,  and 

3.  Interpretants. 

such  that  each  class  of  entities  (except  the  first)  is  divided  into  levels  (not  necessarily 
disjoint), some of which may be more basic than others, such that entities at level n+1 are
constructed from entities at level n (and other entities at level n+1) by use of a fixed set of
constructor  functions  (which  may  also  have  parameters  and  context  conditions).  In 
addition,  there  may  be  a  priority  (partial)  ordering  on  these  functions  at  each  level. 
Finally,  there  may  be  predicates,  relations  and  functions  expressing  properties  of  entities  at 
each  level  of  each  stage.  [] 

We  may  illustrate  the  concepts  in  this  definition  with  examples  from  the  semiotic  system  of 
spoken  English.  The  underlying  medium  is  sound,  that  is,  physical  vibrations.  The  signs  are 
classed into the usual levels of spoken English: phonemes, morphemes, words, phrases,
sentences,  and  discourse  units.  Constructors  at  the  sentence  level  are  given  by  rules,  as 
previously  described.  Objects  and  interpretants  are  more  problematic;  it  seems  fair  to  say 
that  it  is  the  objective  of  current  research  in  Cognitive  Science  and  Artificial  Intelligence  to 
construct  suitable  entities  for  these  classes,  and  to  write  programs  for  processing  them. 

The  purpose  of  this  definition  is  to  explicitly  describe  the  structure  of  a  system  of  related 
signs,  in  order  to  facilitate  the  construction  of  good  representations  of  signs  from  one  system 
by  signs  from  another  system.  The  next  subsection  addresses  such  representations. 

3.4  Semiotic  Morphisms 

This  subsection  focusses  on  our  primary  concern  in  semiotics,  which  is  the  translation  of  signs 
in  one  system  to  signs  in  another  system.  It  is  our  intention  to  provide  the  theoretical 
background  for  a  general  theory  of  the  construction  and  interpretation  of  signs.  For  example, 
generating  an  optimal  (or  at  least  reasonably  good)  explanation,  generating  appropriate 
graphical icons, choosing a good file name, choosing a good analogy, and understanding texts
and/or  graphics,  can  all  be  seen  as  problems  of  translating  signs  from  one  sign  system  into 
another.  Notice  that  the  problem  of  choosing  an  optimal  mix  of  media  also  falls  in  this 
framework,  since  we  can  regard  the  signs  of  each  media  mixture  as  forming  a  subsystem  of  the 
total  sign  system  within  which  we  must  choose  representations.  This  subsection  addresses 
general  questions  about  the  nature  of  translations  between  sign  systems,  and  the  reasons  for 
preferring  one  translation  to  another.  In  order  to  formulate  such  questions  with  sufficient 
generality,  we  first  introduce  another  basic  concept,  that  of  a  sign  system. 

Definition  2:  A  sign  system  is  a  class  of  entities,  called  signs,  divided  into  a  set  of  levels 
(numbered  1  to  N  and  not  necessarily  disjoint),  some  of  which  may  be  more  basic  than  others, 
such that entities at level i+1 (for 1≤i<N) are constructed from entities at level i (and other
entities  at  level  i+1)  by  use  of  a  fixed  set  of  constructor  functions  (these  may  also  have 
parameters  and  context  conditions).  In  addition,  there  may  be  a  priority  (partial)  ordering 
on  these  constructors  at  each  level.  Finally,  for  each  level,  there  may  be  predicates,  relations 
and  functions  expressing  properties  of  signs  at  that  level.  [] 

Artificial systems often exhibit the structures in this definition in a very natural way. For
example, let us consider a simple line-oriented editor for a standard 24 line by 80 character
screen. The lowest hierarchical level is that of characters, the second that of lines, the third
that of screenfuls, and the last that of sequences of screens; thus, entities at the second level
consist of strings of 80 or fewer characters, and entities at the third level consist of strings of
24 or fewer lines. We can see this simple system as having just one constructor at each level
greater than 1, namely string-of(a1,...,aN,N) with parameter N, which "strings together" N
entities at the next lower level. The second level has the context condition 0<N≤80, and the
third 0<N≤24. Since there is at most one constructor at each level, the priority ordering is
trivial.  However,  there  are  some  interesting  predicates  and  functions,  such  as  the 
LINELENGTH  function  for  lines,  and  the  ALPHANUMERIC  predicate  for  characters. 
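
The editor example is small enough to sketch directly. The Python names below (`string_of`, `make_line`, `make_screen`, `linelength`, `alphanumeric`) are our own renderings of the constructor and properties named above, and the sketch assumes that the upper bounds are inclusive (80 or fewer characters, 24 or fewer lines) and that N is simply the number of items supplied.

```python
from typing import Sequence, Tuple

def string_of(items: Sequence, limit: int) -> Tuple:
    """String together N entities, enforcing the context condition 0 < N <= limit."""
    n = len(items)
    if not 0 < n <= limit:
        raise ValueError(f"context condition violated: N={n}, limit={limit}")
    return tuple(items)

def make_line(chars: str) -> Tuple:
    return string_of(chars, 80)    # level 2: at most 80 characters

def make_screen(lines: Sequence[Tuple]) -> Tuple:
    return string_of(lines, 24)    # level 3: at most 24 lines

def linelength(line: Tuple) -> int:
    """The LINELENGTH property function for level-2 entities."""
    return len(line)

def alphanumeric(ch: str) -> bool:
    """The ALPHANUMERIC predicate for level-1 entities (single characters)."""
    return len(ch) == 1 and ch.isalnum()

line = make_line("Switch 1 up")
screen = make_screen([line] * 3)
```

An attempt to build an 81-character line or a 25-line screen raises `ValueError`, which is how this sketch models a rule being inapplicable in a context where its condition fails.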

A  more  sophisticated  editor,  specifically  oriented  toward  text  editing,  might  have  character, 
word, sentence, and paragraph among its levels. It might have one sentence level constructor
for each possible final punctuation, e.g., SENT.(a1,...,aN,N), SENT?(a1,...,aN,N),
SENT!(a1,...,aN,N), each with parameter N and context condition N>0. Here SENT.
clearly has priority over SENT? and SENT!. (Note that many editors in current use provide
both line-oriented and sentence-oriented commands.)


We  can  also  illustrate  Definition  2  in  the  domain  of  computer  graphics.  Example  entities  are 
lines, characters, circles and squares. Levels might consist of pixels (individual "dots" on the
screen), lines, simple figures, and windows (consisting of arbitrary "scenes," collections of
entities  at  lower  levels,  plus  other  windows);  each  entity  at  each  level  must  also  have 
associated  attributes  for  location  on  the  screen,  and  size;  there  may  also  be  attributes  for  color 
and  intensity.  The  most  interesting  constructor  here  is  WINDOW,  which  can  encapsulate  any 
collection  of  entities  from  any  levels. 

The  classes  of  signs,  objects  and  interpretants  involved  in  a  semiotic  system  each  form  a  sign 
system,  as  follows  directly  by  comparing  the  above  definition  with  that  of  semiotic  system. 

We now come to the main concept of this section, which is intended to capture the notion of mapping
signs from one system to representations as signs in another system. Such a mapping may or
may  not  preserve  the  structure  of  a  sign  system.  The  degree  to  which  it  does  so  affects  the 
quality  of  its  representations,  as  made  precise  in  Definition  4  below. 

Definition 3: Let S1 and S2 be sign systems. Then a (semiotic) morphism M: S1 → S2,
from S1 to S2, consists of the following partial functions (all denoted M):

1. Entities of S1 → Entities of S2,

2. Levels of S1 → Levels of S2,

3. Classes of S1 at Level i → Classes of S2 at Level M(i),

4. Constructors of S1 at Level i → Constructors of S2 at Level M(i), and

5. Property Predicates and Functions of S1 at Level i → Property Predicates and
Functions of S2 at Level M(i),

such that

1. if i<j (for 1≤i,j≤N, where N is the number of levels of S1) then M(i)<M(j),

2. if r: <c1>...<cn> → <c> is a constructor at level i of S1, then M(r):
M(<c1>)...M(<cn>) → M(<c>) is a constructor at level M(i) of S2 (if it is defined),
and

3. if p: <c1>...<cn> is a predicate at level i of S1, then M(p): M(<c1>)...M(<cn>) is
a predicate at level M(i) of S2 (if it is defined)⁸.

⁸ As in the previous subsection, by translating functions to relations, this condition applies to property
functions as well, and a slight generalization also permits translating arguments and/or values that are not
entities.

We will say that M preserves entity e at level i (for 1≤i≤N, where N is the number of
levels of S1) if M(i) is defined and M(e) is at level M(i) in S2. Then M preserves level i if M
preserves all entities at level i (for which it is defined), and M is level preserving if it
preserves all levels of S1 (for which it is defined).

We say that M preserves constructor r: <c1>...<cn> → <c> (at level i) for entities
e1,...,en if r(e1,...,en) is defined, if M(r)(M(e1),...,M(en)) is defined, and if it equals
M(r(e1,...,en)). Then M preserves constructor r if it preserves r at all entities for which r is
defined; and M preserves constructors (at level i) if it preserves all constructors (at level i
for which it is defined) of S1.

If r and r' are constructor functions of S1 and r>r' (r has priority over r') in S1, then we say that M
preserves the priority of r over r' if M(r)>M(r') in S2, provided that M(r) and M(r') are
defined. M is priority preserving if it preserves all priorities in S1 (for entities where it is
defined).

Next, we say that M preserves a property relation p: <c1>...<cn> of S1 provided that
M(p) is defined and M(p)(M(e1),...,M(en)) holds whenever p(e1,...,en) holds, for ei of class
<ci> in S1⁹. Also, M is property preserving if it preserves all properties of S1 (for which
it is defined).

Finally, we say that M is structure preserving if it is level preserving, constructor
preserving,  priority  preserving,  and  property  preserving.  [] 
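
The constructor-preservation condition is the easiest of these to check mechanically. The sketch below is a toy illustration under our own assumptions: entities are strings, the only constructor is concatenation (mapped to itself by M), and `None` stands for "undefined." The function and variable names are hypothetical.

```python
from typing import Callable, Optional, Sequence

def preserves_constructor(M: Callable[[str], Optional[str]],
                          r: Callable[..., Optional[str]],
                          M_r: Callable[..., Optional[str]],
                          args: Sequence[str]) -> bool:
    """Check the constructor-preservation condition at the entities `args`.

    None stands for "undefined", since every component of a morphism is partial.
    """
    result = r(*args)
    if result is None:
        return False                  # r(e1,...,en) must be defined
    images = [M(e) for e in args]
    if any(img is None for img in images):
        return False                  # M must be defined on each argument
    rebuilt = M_r(*images)            # M(r)(M(e1),...,M(en))
    return rebuilt is not None and rebuilt == M(result)

def concat(a: str, b: str) -> str:
    return a + b

def to_upper(s: str) -> str:
    return s.upper()

def reverse(s: str) -> str:
    return s[::-1]

upper_ok = preserves_constructor(to_upper, concat, concat, ("ab", "cd"))
reverse_ok = preserves_constructor(reverse, concat, concat, ("ab", "cd"))
```

Upper-casing commutes with concatenation, so it preserves the constructor at these entities; reversal does not, because reversing "abcd" gives "dcba" while concatenating the reversed parts gives "badc".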

These careful distinctions about what kind of structure might be preserved when signs
from one system are represented by signs from another will be used in Section 6 to formulate
precise experimental hypotheses about the quality of representations.

It is important to notice that semiotic morphisms need not be totally defined; that is, each of
the functions denoted M can be undefined on some of what is in S1. For example, there need
not be any representation in S2 for some entities in S1; in particular, some components of M
could even be totally undefined, i.e., the empty function.

An example of a semiotic morphism is the correspondence between the physical order of lights
on the box, and the order in which clauses are given to describe the lights (narrative order).

⁹ This extends to functions and to non-entity arguments and/or values as before.


The  processes  of  signification  and  interpretation  (i.e.,  of  constructing  objects  and  interpretants 
within  a  semiotic  system)  might  be  viewed  in  the  light  of  semiotic  morphisms,  since  the 
entities  that  they  map  from  and  to  are  both  sign  systems.  This  only  makes  sense  because 
semiotic  morphisms  can  be  partial  functions.  For  example,  it  is  often  the  case  that  low  level 
signs  in  complex  systems,  such  as  phonemes  in  the  English  natural  language  semiotic  system, 
seem  not  to  have  either  denotations  or  interpretations.  Moreover,  there  is  very  little  structure 
other  than  sequential  succession  to  preserve  at  these  levels. 

It seems clear that a structure preserving semiotic morphism M: S1 → S2 will faithfully
represent all of the semiotic structure of S1 in terms of that available in S2. This would seem
to be desirable for an optimal representation; however, if the resulting structures in S2 are too
complex, then they may be hard for human beings to understand, and thus not really optimal.
For example, if S1 consists of parse trees for English sentences and S2 consists of the usual
"printed page" text format, then it is possible to translate all the syntactic information that is
available in S1 into structures in S2 with so-called "phrase structure" notation, which uses
brackets to delimit phrases and uses subscripts on brackets to indicate the class of phrase that is
involved. For example, the sentence given previously would be represented by something like

[[[[The]Det [light]N]NP [[on]Prep [[the]Det [left]N]NP]PP]NP [[comes]V [on]Part]VP]Sentence

in this notation. This may be useful for some purposes, but it is clearly not optimal for all
purposes.  The  point  to  be  noted  is  that  there  is  some  kind  of  a  trade-off  between  the  degree 
of  structure  preservation  and  the  degree  of  complexity  of  the  resulting  representations. 
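
The trade-off can be seen by generating the bracketed form mechanically. In the sketch below, trees are nested `(label, parts)` tuples, the function name `bracket` is hypothetical, and the parse of the example sentence is an assumed analysis; subscripts are written as plain suffixes after each closing bracket.

```python
def bracket(tree):
    """Render a (label, parts) tree in bracketed phrase-structure notation."""
    if isinstance(tree, str):
        return tree                       # a word: an entity of the next lower level
    label, parts = tree
    return "[" + " ".join(bracket(p) for p in parts) + "]" + label

parse = ("Sentence", [
    ("NP", [
        ("NP", [("Det", ["The"]), ("N", ["light"])]),
        ("PP", [("Prep", ["on"]),
                ("NP", [("Det", ["the"]), ("N", ["left"])])]),
    ]),
    ("VP", [("V", ["comes"]), ("Part", ["on"])]),
])

notation = bracket(parse)
plain = "The light on the left comes on"

# Every phrase boundary and class is preserved, at the cost of a much longer string.
overhead = len(notation) - len(plain)
```

The bracketed string keeps all of the tree's structure, but it is several times longer than the plain sentence, which is one crude measure of the added complexity.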

We  now  turn  to  the  rather  delicate  issue  of  determining  whether  one  representation  (i.e., 
semiotic  morphism)  is  better  than  another.  One  evident  consideration  is  whether  it  preserves 
more  structure  than  the  other  (of  course,  this  will  make  it  better  only  if  the  complexity  of  its 
representations is not too great).

Definition 4: Let M' and M be morphisms from sign system S1 to sign system S2. Then M'
preserves more structure than M does, provided that:

1. if M preserves an entity e at level i, then so does M';

2. if M preserves a constructor r at entities e1,...,en, then so does M';

3. if M preserves a priority r>r', then so does M'; and

4. if M preserves a property p, then so does M'. []
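
Definition 4 amounts to four subset tests, which the following sketch makes explicit. It is a simplification under our own assumptions: what each morphism preserves is recorded as a set of names (ignoring the level of each entity and the argument tuples at which a constructor is preserved), and the names `Preserved` and `preserves_more` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Preserved:
    """What a morphism preserves, recorded as sets of (hypothetical) names."""
    entities: FrozenSet[str] = frozenset()
    constructors: FrozenSet[str] = frozenset()
    priorities: FrozenSet[str] = frozenset()
    properties: FrozenSet[str] = frozenset()

def preserves_more(m_prime: Preserved, m: Preserved) -> bool:
    """Definition 4: everything preserved by M is also preserved by M'."""
    return (m.entities <= m_prime.entities
            and m.constructors <= m_prime.constructors
            and m.priorities <= m_prime.priorities
            and m.properties <= m_prime.properties)

m_a = Preserved(entities=frozenset({"e1", "e2"}), properties=frozenset({"RED"}))
m_b = Preserved(entities=frozenset({"e1"}), properties=frozenset({"RED", "SQUARE"}))
m_c = Preserved(entities=frozenset({"e1", "e2"}), properties=frozenset({"RED", "SQUARE"}))

# m_c dominates both; m_a and m_b are incomparable in either direction.
incomparable = not preserves_more(m_a, m_b) and not preserves_more(m_b, m_a)
```

Because the relation is only a partial ordering on morphisms, pairs such as `m_a` and `m_b` cannot be ranked by this definition alone.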


The real difficulties arise in trying to compare morphisms M and M' such that neither
preserves  strictly  more  structure  than  the  other,  or  for  which  one  preserves  more  structure  but 
also  produces  more  complex  representations.  For  example,  M  might  preserve  more  levels  than 
M',  whereas  M'  preserves  more  properties  than  M.  It  is  for  unclear  cases  such  as  this  that  our 
future  experimental  results  will  be  especially  interesting.  The  framework  that  we  have 
developed  suggests  that  preserving  levels  is  more  basic  than  preserving  priorities,  which  is 
more  basic  than  preserving  properties.  It  is  not  difficult  to  formulate  a  number  of  specific 
experimental  hypotheses  that  will  test  these  suggestions,  and  we  hope  this  will  eventually  lead 
to a workable notion of what it means to be a good representation. "Workable" here means
that it will be possible to effectively determine of a given representation whether or not it is
adequate to the task in hand. More optimistically, given sign systems S1 and S2, where S1
contains abstract forms of the information to be conveyed, it may be possible to discover (to
"compute" even) a semiotic morphism M from S1 to S2 that will give adequate
representations in S2 for entities from S1. For example, S1 might contain instructions for
repairing  some  piece  of  equipment,  and  S2  might  be  a  color  graphics  terminal.  The  problem  is 
then  to  generate  displays  that  utilize  the  capabilities  of  that  particular  terminal  reasonably 
well. 

Similar considerations arise in [Gentner 83]'s discussion of successful and unsuccessful natural
language  analogies.  We  now  quote  from  the  summary  of  that  paper: 

The  structure-mapping  theory  describes  the  implicit  interpretation  rules  of 
analogy. The central claims of the theory are that analogy is characterized by the
mapping  of  relations  between  objects,  rather  than  attributes  of  objects,  from  base  to 
target;  and,  further,  that  the  particular  relations  mapped  are  those  that  are 
dominated  by  higher-order  relations  that  belong  to  the  mapping  (the  systematicity 
claim).  These  rules  have  the  desirable  property  that  they  depend  only  on  syntactic 
properties  of  the  knowledge  representation,  and  not  on  the  specific  content  of  the 
domain. 

Our  approach  introduces  a  finer  structure  on  the  source  and  target  domains,  and  thus  permits 
finer  hypotheses  about  what  makes  analogies  good.  In  addition,  our  approach  is  not  limited  to 
natural  language  as  the  target  sign  system,  and  considers  representations  other  than  analogies. 

We  now  introduce  the  notion  of  a  subsystem  of  a  sign  system.  This  notion  has  already  been 
used informally in the discussion of choice of media mix at the beginning of this subsection,
and  will  be  used  again  in  the  next  subsection. 

Definition  5:  Given  sign  systems  S  and  S',  we  will  say  that  S'  is  a  subsystem  of  S  provided 
that:  every  level  of  S'  is  also  a  level  of  S;  every  entity  of  S'  is  also  an  entity  of  S  and  entities  in 
S'  have  the  same  level  in  S'  as  they  do  in  S;  every  property  function  and  predicate  in  S'  is  also 
one in S, and has the same values in S' as in S; every constructor function of S' is also a
constructor of S, and constructors in S' yield the same results in S' as they do in S, and also
have the same parameters and context conditions; and finally, the ordering on the constructors
of S' is the same as that on those constructors in S. []

Now suppose that we are given sign systems S1 and S2 and a semiotic morphism M from S1 to
S2. Then the set of entities M(e) in S2, for those entities e in S1 on which M is defined, has: a
set of levels, inherited from those of S2; a set of constructors, also inherited from those of S2
(but they will have to be undefined whenever combining entities of the form M(e) fails to yield
another of the same form in S2); a priority ordering on these constructor functions, namely the
same one that S2 has; and also the functions and predicates from S2, now thought of as
expressing properties of entities of the form M(e) for e in S1. In short, the entities of the form
M(e)  form  a  subsystem  of  the  sign  system  S2;  we  call  it  the  image  subsystem  of  the  semiotic 
morphism  M. 
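
A toy sketch of the image subsystem follows. It assumes that entities are strings, that the partial function M is given as a dictionary (missing keys mean "undefined"), and that the inherited constructor is concatenation; the names `image` and `restrict` are hypothetical.

```python
from typing import Callable, Dict, Optional, Set

def image(M: Dict[str, str], entities: Set[str]) -> Set[str]:
    """Entities of the form M(e); M is partial, so entities without a key are skipped."""
    return {M[e] for e in entities if e in M}

def restrict(constructor: Callable[..., str],
             img: Set[str]) -> Callable[..., Optional[str]]:
    """Inherit a constructor of S2, undefined (None) when it would leave the image."""
    def restricted(*args: str) -> Optional[str]:
        if any(a not in img for a in args):
            return None
        result = constructor(*args)
        return result if result in img else None
    return restricted

s1_entities = {"a", "b", "ab", "ba"}
M = {"a": "A", "b": "B", "ab": "AB"}      # M is undefined on "ba"
img = image(M, s1_entities)
concat_in_image = restrict(lambda x, y: x + y, img)
```

Within the image, combining "A" and "B" yields "AB", but combining "B" and "A" is undefined because "BA" is not of the form M(e).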

3.4.1  Iconicity  and  Naturalness 

As  discussed  in  Section  3.1,  semiotics  distinguishes  three  types  of  sign  --  the  index,  the  icon, 
and  the  symbol.  The  symbol  is  fully  arbitrary,  in  the  Saussurean  sense.  The  index  as  signifier 
is a necessary (or probable) concomitant of the signified -- smoke as an index of fire. It is the
icon  which  poses  the  most  interesting  questions  for  the  relation  of  signifier  and  signified.  The 
accepted  definition  of  the  icon  is  that  it  involves  an  actual  resemblance  between  signifier  and 
signified:  a  portrait  signifies  its  subject  by  resemblance,  not  by  convention.  (Compare,  for 
example,  a  highly  conventionalized  political  caricature.) 

This definition implies a specific directionality to the relation between signifier (S2 in
Definitions 3 and 4) and signified (S1), in which the signified is more natural or basic than the
signifier in some sense. Thus, it is often assumed that a diagram, drawing, or visual icon is
more basic, easier to comprehend, and freer of the arbitrary conventions of language. (We
find a folk theory of iconicity in the proverb: "A picture is worth a thousand words.")
However, it is important to note that pictures, diagrams, etc., are only partially iconic in this
sense,  and  also  contain  a  component  of  conventional  representation  that  must  be  learned.  For 
example,  Venn  diagrams  may  appear  fully  iconic  of  boolean  relations  to  someone  accustomed 
to  using  them;  but  to  someone  who  has  not  learned  the  aspect  of  convention  in  this  only 
partially  iconic  semiotic  system,  a  Venn  diagram  may  be  iconic  only  of  a  pretzel  or  symbolic  of 
a  brand  of  beer. 

In  terms  of  the  theoretical  apparatus  introduced  in  Section  3.4,  the  problem  of  generating  an 
iconic  representation  is  one  of  constructing  a  semiotic  morphism  from  one  sign  system  to 
another.  The  considerations  of  the  above  paragraph  suggest  that  the  set  of  all  representations 
that  are  so  generated,  viewed  as  a  sign  system  (this  is  the  image  subsystem  of  the  full  system 
of  possible  representing  signs)  should  be  in  some  sense  simpler,  more  natural,  or  more  basic  (to 
humans)  than  the  original  set  of  signs.  It  is  hoped  that  experimental  explorations  along  these 
lines  will  lead  to  a  deeper  understanding  of  iconicity. 

4  Some  Further  Analytic  Concepts 

We  now  consider  some  additional  concepts  used  to  formulate  the  variables  and  hypotheses  in 
Sections  5  and  6.  We  begin  with  the  possible  cognitive  structures  of  the  task  domain,  and 
then  consider  some  linguistic  issues,  including  syntactic  placement  and  strength,  focus,  and 
indexing.

4.1  Cognitive  Structure  of  the  Task  Domain 

The task of explaining the logic box (Figure 1) can be fulfilled by accounts based on at least
five  different  ways  of  understanding  the  task  domain: 

1.  Behavioral.  This  is  a  simple,  unanalyzed  description  that  matches  a  pattern  of  lights  to 
corresponding  switch  positions.  Loosely  speaking,  an  account  at  this  level  sounds  like  a 
description  rather  than  an  explanation. 

2.  Combinatorial.  An  account  at  this  level  considers  the  possible  patterns  of  lights  as  an 
aggregate.  This  might  be  displayed  in  a  table,  such  as  a  truth  table  of  the  relation  of 
switch  positions  and  lights.  An  optional  addition  at  this  level  explains  that  only  four 
combinations  of  switch  position  are  possible,  by  simple  multiplication  of  two  switches 
times  two  switch  positions. 

3.  Logical.  Such  an  account  would  indicate  the  Boolean  functions  of  the  switch  positions 
represented  by  each  light,  using  primitive  functions  like  AND,  OR,  and  IF.  Depending 
on the background of the audience, accounts at this level would differ in how full an
explanation  of  logical  functions  is  required. 

4.  Electronic.  An  account  at  this  level  might  use  a  circuit  diagram  to  indicate  how  the 
relation between switch position and light patterns is accomplished.

5. Physical. An account at this level would utilize the principles of physics and chemistry
underlying  the  previous  level  of  electronics. 

The  first  three  of  these  form  a  hierarchy  by  levels  of  abstraction,  while  the  first,  fourth  and 
fifth  form  a  hierarchy  by  levels  of  reduction.  We  have  found  instances  of  the  first  three  of 
these  levels  in  instructors'  explanations,  and  the  fourth  in  their  briefings  and  debriefings;  the 
fifth  was  included  for  completeness. 

This  categorization  of  possible  cognitive  organizations  of  the  task  domain  relates  to  work  by  a 
number  of  researchers  on  the  cognitive  organization  of  explanations.  For  example,  [Kieras  82] 
distinguishes  knowledge  of  what  a  device  is  for,  how  to  operate  it,  and  how  it  works.  The  first 
two  levels  of  description  of  our  task  correspond  to  varying  degrees  of  knowledge  of  the  first 
type;  how  to  operate  the  device.  The  third,  fourth,  and  fifth  levels  correspond  to  varying 
degrees  of  knowledge  of  how  the  device  works.  We  note  that  a  description  at  any  of  these  five 
levels  may  function  as  an  explanation,  depending  on  the  purpose  of  the  description  and  the 
existing  level  of  understanding  of  the  audience.  Similarly,  in  a  study  of  Navy  instruction 
manuals, [Stevens & Steinberg 81] provide a typology of explanations beginning at the
behavioral level and proceeding to more abstract forms of explanations. (No exact
correspondence between the higher levels of their taxonomy and ours is possible, since ours is
a simple, non-branching tree structure, while theirs is a matrix of four two-way distinctions.)

The  first  round  of  experiments  gave  instructors  highly  nondirective  instructions,  telling  them 
to  teach  students  how  to  check  whether  the  device  was  doing  what  it  was  supposed  to.  This 
produced  explanations  of  types  one  and  two.  Interestingly,  although  the  audience  was  a  group 
of  community  college  students  having  no  background  in  mathematics  or  electronics,  many 
students found these explanations unsatisfactory, and in a subsequent debriefing session
requested  further  information.  Examples  of  such  comments  are: 

(6) The thing is so easy to understand, I mean, it’s, that we
look for the complications, you know, we’re trying to
look for something that you know, what is, what’s there.
and we have to really explain it and you haveta get
into . . .

(7)  I  know.  It's  hard  to  explain  because  it's  so  easy. 

These  comments  strongly  suggest  that  the  cognitive  structure  underlying  an  explanation  is  an 
important  variable  for  comprehension.  Our  later  experiments  elicited  explanations  at  other 
cognitive  levels,  by  asking  the  instructor  to  present  the  device  as  the  control  panel  of  a  set  of 
sluices, with the switches controlling gates and the lights indicating whether or not the sluices
are  open;  this  required  students  to  understand  not  only  the  current  relations  but  the  basis 
underlying  any  possible  set  of  relations. 

In  our  first  set  of  pilot  experiments,  the  comprehension  task  (for  the  audience)  was  to  write  an 
explanation  of  the  box,  on  the  basis  of  the  explanation  given  by  the  instructor.  A  similar 
procedure  can  be  used  to  test  comprehension  of  any  of  the  cognitive  organizations  listed  above 
(of course, the writing skills of the subjects will also affect such a measure). Our Phase II
experiments will use more focused test questions to probe particular cognitive organizations.
Thus, a question such as Can all four lights be on at once? can be answered from
simple  observation  at  the  behavioral  level.  In  contrast,  a  question  like  Does  any  light 
correspond  to  the  logical  function  ((not  Switch  1)  or  (not  Switch  2))?  requires 
some  comprehension  at  the  third  level,  that  of  Boolean  functions. 
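
The contrast between the combinatorial and logical levels, and between the two sample test questions, can be made concrete with a small sketch. The light functions below are hypothetical stand-ins chosen only for illustration (this section does not specify the actual wiring of the logic box), and the helper names are likewise assumptions.

```python
from itertools import product

# Hypothetical light functions; the real wiring of the logic box is not
# given in this section. True means "switch up" or "light on".
LIGHTS = {
    "A": lambda s1, s2: s1 and s2,        # AND
    "B": lambda s1, s2: s1 or s2,         # OR
    "C": lambda s1, s2: s2,               # depends on Switch 2 only
    "D": lambda s1, s2: (not s1) or s2,   # IF Switch 1 THEN Switch 2
}

# Two switches times two positions gives four settings.
SETTINGS = list(product([False, True], repeat=2))


def truth_table(lights):
    """Combinatorial level: one row of light states per switch setting."""
    return {
        (s1, s2): {name: bool(f(s1, s2)) for name, f in lights.items()}
        for s1, s2 in SETTINGS
    }


def all_on_somewhere(lights):
    """Behavioral-level question: can all lights be on at once?"""
    return any(all(row.values()) for row in truth_table(lights).values())


def lights_matching(lights, target):
    """Logical-level question: which lights compute the function `target`?"""
    return [
        name for name, f in lights.items()
        if all(bool(f(s1, s2)) == bool(target(s1, s2)) for s1, s2 in SETTINGS)
    ]


def nand(s1, s2):
    """The function ((not Switch 1) or (not Switch 2)) from the sample question."""
    return (not s1) or (not s2)
```

Under these assumed functions, the first question is answered by scanning rows of the table, whereas the second requires comparing whole columns against a Boolean function, which is the distinction the two test questions are meant to probe.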

4.2  Sentential  Syntax 

A  number  of  issues  at  the  syntactic  and  lexical  levels  appear  to  affect  the  comprehensibility  of 
explanations.  These  include  the  syntactic  placement  of  information  and  the  strength  of 
structural  markers.  Variables  at  this  level  may  not  be  highly  trainable  for  a  human  instructor 
delivering  a  non-scripted  explanation,  but  may  be  extremely  valuable  in  scripting  computer 
produced  or  videotaped  explanations. 

4.2.1  Syntactic  Placement  of  Information 

In  a  semantically  restricted  domain  like  that  of  the  present  study,  it  is  possible  to  examine  the 
syntactic  placement  of  information  quite  precisely.  This  is  valuable  because  syntactic 
placement  is  an  important  organizational  device  for  discourse  that  allows  the  analyst  to 
determine  what  information  is  given  major  importance  and  what  information  is  given  minor 
importance. Our task domain involves three basic types of information: information about the
switches,  information  about  the  lights,  and  information  about  relations  between  the  two. 

Linguistic  research  has  shown  that  there  is  a  continuum  of  syntactic  constituents  ranging  from 
the maximally sentence-like on downward. It has been further shown that the more important
information is, the more likely it is to be placed in sentence-like syntactic constituents [Linde
74,  Ross  73].  To  aid  comprehensibility,  it  appears  that  important  information  should  be 
placed  in  syntactically  heavy  constituents,  that  is,  in  constituents  that  are  quite  sentence-like. 
Similarly,  semantically  parallel  information  should  be  placed  in  syntactically  parallel  units. 

4.2.2  Strength  of  Structural  Markers 

Structural  markers  are  pieces  of  text  that  invoke  internal  nodes  of  the  discourse  structure  tree, 
such  as  STATEMENT/REASON,  IF/THEN,  and  EXAMPLE,  or  that  indicate  movement  in 
the  tree.  Our  formal  theory  of  discourse  structure  states  that  the  first  of  these  indicates 
relations  between  pieces  of  information,  while  the  second  type  indicates  change  of  focus  of 
attention.  Text  invoking  these  markers  may  do  so  with  varying  degrees  of  strength.  There 
are  a  number  of  dimensions  that  combine  to  produce  strength  of  markers,  including  weight  of 
the  syntactic  placement  of  the  marker,  degree  of  semantic  ambiguity  or  univocality,  and 
length  in  words. 

4.3  Focus 

To  describe  the  semantics  of  these  explanations,  the  additional  notion  of  focus  is  required. 
The  logic  box  employed  in  our  explanation  task  has  both  lights  and  switches,  and  the  patterns 
of  each  may  vary.  A  coherent  description  must  focus  on  one  of  these,  describing  the  other  in 
terms  of  the  item  in  focus.  (8)  shows  a  focus  on  lights,  while  (9)  shows  a  focus  on  switches. 

(8) What happens is each of those lights is a logical
function, which means you know, true and false, or
yes and no, of the two switches. So, for instance,
this light C is on, just depends on Switch Two. Now
whenever Two is up, C is on. It doesn't matter what
One is on.

(9) We’ll go through it again slowly. O.K. In one
position of these two switches, where One is up and
Two is down, all those lights are off. We take
Switch One and move it to its other position, three
of the lights come on. We reverse this combination
and make them both go up, we got again three
of the lights are on but a different three lights.
And, if we move em till both down, again,
we got three lights on but these two lights are
are changed over.

Issues  involving  focus  include  the  question  of  whether  there  is  an  optimal  focus  for  a  given 
task,  and  the  effects  of  maintaining  or  switching  focus.  Preliminary  results  suggest  that 
changes  of  focus  are  confusing  and  that  poor  placement  within  the  explanation  structure  can 
make them even more confusing; (10) is an example of such a change.

(10) So, in a condition where they’re all off, this switch,
Two, is down, and this switch, One, is up, and if we
change the position of just one switch, we’ll change the
condition of the lights. So we’ll go from all off to
three of these lights going on and the three lights that
come on are A, B, and 0. If we go back to this situation
which is where we started, they’ll all go back off again.

This  explanation  begins  with  a  focus  on  the  switches  and  changes  in  the  middle  to  a  focus  on 
the  lights. 

The  taxonomy  of  explanation  types  given  in  [Stevens  &  Steinberg  81]  contains  a  number  of 
distinctions  that  correspond  to  this  notion  of  focus;  these  are  distinctions  at  the  same  level  of 
abstraction, such as a "stuff-state-attribute" description of a physical system, versus a
"stuff-as-a-transport medium" description of the same system.

4.4  Prior  Text  Reference 

To  understand  explanations  (or  indeed  any  discourse  type)  we  must  take  account  not  only  of 
the  linguistic  form  of  the  explanation  and  its  semantic  structure,  but  also  of  the  knowledge 
shared  by  its  speaker  and  addressees.  In  recent  years,  the  fields  of  cognitive  science, 
linguistics, and artificial intelligence have all been concerned with the effects of shared
knowledge on linguistic and cognitive structures. The present discussion is concerned with the
linguistic forms that speakers use to indicate that a particular body of shared knowledge is
relevant or necessary in order to understand a given explanation.

As we define it, a prior text reference is a pointer in a given text to some body of
information  not  present  in  that  text,  or  to  some  prior  text  that  speaker  and  audience  are 
presumed  to  share.  This  definition  derives  from  the  discussion  in  [Becker  81]  of  what  they 
term  indexing  of  prior  text.  We  have  changed  the  term  to  avoid  confusion  with  Peirce's 
related  but  different  sense  of  the  term  index.  Prior  text  reference  can  be  accomplished 
explicitly, as in As we were discussing last week about circuits, or it may be
accomplished implicitly, as in Does this have to do with circuitese? For a given prior
text  reference  to  succeed,  it  must  indicate  a  body  of  knowledge  or  a  prior  text  that  the 
audience  actually  has  mastered.  Thus,  the  reference  If  you  remember  the  commutative 
law  from  high  school  algebra  will  succeed  only  if  the  audience  remembers  the 
commutative law from high school algebra. Another example from our data is the statement
that  the  operation  of  the  logic  box  is  like  a  set  of  traffic  lights.  This  will  succeed 
only  if  the  audience  does  in  fact  know  enough  about  how  traffic  lights  work.  Section  6.1  gives 
hypotheses  about  prior  text  reference. 

5  Variables  of  Interest 

This  section  discusses  some  variables  applicable  to  our  data  that  appear  to  be  important  for 
the  comprehensibility  of  instruction;  these  variables  are  used  in  the  hypotheses  of  the  following 
section.  We  expect  that  further  variables  will  be  found  as  the  research  progresses. 

5.1  Discourse  Level  Semantic  Variables 

5.1.1  Cognitive  Structure  of  the  Task  Domain 

As discussed in Section 4.1, explanations can be based on five different cognitive organizations of
the  task  domain:  behavioral,  combinatorial,  logical,  electronic,  or  physical.  Similarly,  the 
comprehension  task  can  probe  any  of  these. 


5.1.2  Focus 

An  explanation  can  be  based  either  on  the  condition  of  the  lights  or  on  the  positions  of  the 
switches,  treating  the  other  as  functionally  dependent  on  the  one  chosen  as  basic.  Similarly, 
the  comprehension  task  can  be  based  either  on  the  lights  or  on  the  switches. 

5.1.3  Prior  Text  Reference 

Prior  text  reference,  as  defined  in  Section  4.4,  may  be  present  or  absent  in  any  given  part  of 
an  explanation. 

5.1.4  Form  of  Prior  Text  Reference 

Prior  text  reference  may  be  either  explicit  or  inferential.  Boolean  algebra  tells  us  ...  is 
an  explicit  prior  reference.  Is  that  circuitese?  is  inferential. 

5.2  Variables  of  Discourse  Structure 

This  subsection  uses  the  distinction  between  discourse  type  and  discourse  unit  introduced  in 
Section  2.1. 

5.2.1  Discourse  Type 

An  explanation  may  consist  of  any  of  several  discourse  types,  including  reasoning,  narrative, 
and  pseudonarrative. 

5.2.2  Number  of  Discourse  Units 

An  explanation  may  contain  one  or  more  instances  of  each  of  its  discourse  types,  and  it  may 
consist  of  several  different  discourse  types.  For  example,  a  single  explanation  may  consist  of 
three  reasoning  units,  it  may  consist  of  one  reasoning  unit  and  two  pseudonarrative  units,  etc. 

5.2.3  Presence  or  Absence  of  Discourse  Summary 

Any of the discourse types that have been found to perform the function of explanation may
have  as  part  of  their  structure  an  optional  summary,  giving  an  overview  of  the  entire 
explanation,  or  of  some  part  of  it. 

5.2.4 Placement of Summary

Within  a  given  discourse  unit,  a  summary  may  be  placed  at  the  beginning,  at  the  end,  or  may 
be  embedded  within  the  discourse  unit.  (This  distinction  between  an  initial  and  a  final 
summary  is  related  to  the  distinction  commonly  made  in  rhetoric  between  deductive  and 
inductive  paragraph  structure.  A  deductive  structure  places  the  topic  sentence  at  the 
beginning  of  the  paragraph  and  follows  it  with  supporting  material;  an  inductive  structure 
begins  with  cases  or  examples  building  up  to  a  final  general  statement.) 

5.2.5  Presence  or  Absence  of  Explicit  Structural  Markers 

In  the  construction  of  discourse  structure  trees,  transformations  can  establish  internal  nodes  in 
the  tree,  and  can  also  alter  the  focus  of  attention  within  the  tree.  These  functions  may  be 
accomplished  explicitly,  by  separate  pieces  of  text,  or  they  may  be  accomplished  implicitly,  as 
part  of  the  semantics  of  text  primarily  devoted  to  content.  Such  implicit  markers  depend  on 
the  fact  that  each  discourse  type  has  a  characteristic  default  node  type.  For  example,  in  a 
narrative,  the  default  node  type  is  SEQ,  corresponding  to  the  narrative  presupposition,  the 
rule  of  interpretation  stating  that  events  are  assumed  to  have  occurred  in  the  same  order  as 
the  main  clauses  that  refer  to  them.  Thus,  in  a  narrative,  it  is  sufficient  to  say  He  moved  the 
second  switch.  Two  lights  went  out.  It  is  possible,  but  not  necessary  to  add  a  marker 
such  as  and  then,  getting  He  moved  the  second  switch  and  then  two  lights  went  out. 

5.2.6  Strength  of  Marker 

In the case where an explicit structural marker is present, we may ask how strongly the
marker  is  indicated.  Strength  of  indication  depends  on  a  number  of  factors: 

1.  Syntactic  heaviness  of  the  marker,  i.e.,  whether  it  is  a  single  conjunction,  a  phrase,  a 
dependent  clause,  or  a  sentence. 

2.  Length  in  words  of  the  marker.  (This  is  related  but  not  identical  to  1.) 

3.  Explicitness  of  the  marker.  For  example,  a  marker  like  so  is  quite  inexplicit,  and  may 
indicate  causality,  simple  sequence,  resumption  of  a  previous  topic,  etc.  In  contrast,  a 
marker like because of that indicates causality explicitly and unambiguously.

4. Position in the sentence. A position at the beginning of the sentence is heavier than a
later one; there are a number of syntactic devices in English that can move a constituent
to front position.

5.2.7 Penetrance of Discourse Tree

Different  choices  of  node  type  and  different  orderings  of  subordinate  nodes  (which  correspond 
to  different  embeddings  of  clauses)  can  lead  to  a  variety  of  different  tree  shapes.  In  general, 
discourse  trees  can  be  described  as  fundamentally  deep  structures,  or  broad  structures.  A 
relevant variable, related to a quantity called penetrance in artificial intelligence [Nilsson 71],
is the ratio of the average path length $A$ to the total number $N$ of nodes. Thus,
$P = A/N = \sum_{i=1}^{T} L_i / (TN)$, where $T$ is the number of paths and $L_i$ is the length
of the $i$th path, since $A = \sum_{i=1}^{T} L_i / T$. $P$ is larger for deep trees and smaller
for broad or shallow trees.
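
The penetrance ratio can be illustrated with a short sketch. It assumes, since the text does not say, that a path runs from the root to a leaf and that its length is counted in edges; the tuple representation of a tree and the function names are also illustrative choices, not part of the report's formalism.

```python
def root_to_leaf_lengths(tree):
    """Return the length in edges of every root-to-leaf path.

    A tree is a pair (label, children), where children is a list of trees.
    """
    _label, children = tree
    if not children:
        return [0]
    return [1 + n for child in children for n in root_to_leaf_lengths(child)]


def count_nodes(tree):
    """Return N, the total number of nodes in the tree."""
    _label, children = tree
    return 1 + sum(count_nodes(child) for child in children)


def penetrance(tree):
    """Return P = A / N, with A the average root-to-leaf path length."""
    lengths = root_to_leaf_lengths(tree)
    average = sum(lengths) / len(lengths)   # A = (sum of L_i) / T
    return average / count_nodes(tree)      # P = A / N


# Two five-node trees: a chain (deep) and a root with four leaves (broad).
deep = ("n1", [("n2", [("n3", [("n4", [("n5", [])])])])])
broad = ("root", [("a", []), ("b", []), ("c", []), ("d", [])])
```

With these conventions the chain has one path of length 4 and P = 4/5, while the broad tree has four paths of length 1 and P = 1/5, in line with the statement that P is larger for deep trees.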

5.2.8  Explicit  Establishment  of  the  Basis  of  Parallel  Structure 

Many  of  the  explanations  in  our  data  contain  parallel  structures.  For  example,  there  may  be 
four subtrees corresponding to the four lights. Or, in a differently organized explanation, there
may  be  four  subtrees  corresponding  to  the  four  possible  switch  positions.  A  variable  of 
interest  is  whether  or  not  the  basis  of  this  parallel  structure  is  made  explicit.  This  could  be 
done  by  a  reference  to  the  existence  of  the  four  lights  in  the  first  case,  or  to  the  simple 
computation  of  two  switches  times  two  switch  positions  in  the  second.  Because  of  the 
existence  of  the  demonstration  prop,  if  the  focus  is  on  the  lights,  the  explicit  establishment  of 
the  four  lights  can  be  accomplished  simply  by  directing  attention  to  the  box  itself.  It  seems 
important that parallel structures in the text be ordered in a way that clearly corresponds to
the  geometry  of  the  physical  world  in  such  a  case.  Such  an  iconicity  between  visual  and 
linguistic  representations  would  be  particularly  important  in  generating  graphics  in  computer 
aided  instruction. 

5.3  Sentential  Level  Variables 

5.3.1  Syntactic  Placement  of  Information 

As discussed in Section 4.2, the syntactic form of the constituent in which information is
placed  is  one  indication  of  its  presumed  importance.  This  variable  appears  in  a  number  of 
hypotheses. 

6 Hypotheses

As  already  noted,  Phase  II  of  this  project  will  test  the  most  interesting  hypotheses  suggested 
during  the  analysis  of  Phase  I  data,  by  presenting  experimentally  varied  explanations  (of  the 
logic  box  task)  to  learners.  Once  the  hypotheses  have  been  selected  and  the  test  explanations 
constructed,  we  will  develop  suitable  dependent  measures  of  comprehension  (verbal  and/or 
behavioral).  Subjects  will  be  randomly  assigned  to  experimental  conditions  so  that,  although 
subject  knowledge  and  education  may  have  some  influence,  it  will  not  be  specific  to  any 
condition.  Thus,  hypotheses  about  explanation  effectiveness  can  be  tested  in  terms  of  student 
comprehension. 

This  section  gives  some  candidate  hypotheses  found  so  far.  We  expect  that  the  statement  of 
these  hypotheses  will  be  refined  during  the  process  of  testing;  i.e.,  that  many  of  these 
hypotheses  will  not  be  tested  in  exactly  the  form  given  here,  and  some  will  not  be  tested  at  all. 
However,  they  all  seem  to  represent  reasonable  intuitions  about  instructional  processes  and 
valuable  directions  for  further  investigation.  The  hypotheses  to  be  tested  will  be  selected 
according  to  the  following  criteria: 

1.  Potential  application  of  the  hypothesis  to  the  improvement  of  human  or  computer  based 
instruction. 

2.  Possibility  of  incorporating  and  varying  the  variables  of  interest  in  instructional 
discourse. 

3.  Possibility  of  accurate  measurement  of  the  degree  of  variation. 

4.  Possibility  of  holding  other  variables  constant. 

Notice that dependent measures can be constructed in at least three ways: (1) performance on
physical tasks involving actual use of the logic box; (2) performance on comprehension tasks,
such as multiple choice questions; and (3) performance on the task of generating an
explanation based on that given by the instructor. Tasks of type (2) and (3) can be aimed at
any of the five levels given in Section 4.1, but tasks of type (3) would be difficult to score.
The hypotheses follow, subdivided into three main categories.

6.1 Discourse Level Semantic Hypotheses

1.  Explanations  at  the  combinatorial  and  logical  levels  will  result  in  superior 
comprehension. (Explanations at the behavioral level will result in inferior
comprehension  because  their  cognitive  structure  is  relatively  simple  and  therefore  they 
must  rely  on  rote  memory;  explanations  at  the  electronic  or  physical  levels  will  result  in 
inferior  comprehension  because  they  are  too  abstract  for  most  learners.  Note  that  this 
assumes  dependent  measures  based  on  performance  of  tasks  at  the  intermediate  levels, 
since  explanations  whose  cognitive  level  matches  that  of  the  comprehension  instrument 
will  get  the  best  scores.) 

2. Comprehension will be impaired if the focus changes among the direct subordinates of
an AND, OR or SEQ node.

3.  Comprehension  will  be  assisted  if  the  focus  of  a  summary  corresponds  to  the  focus  of  the 
nodes that are being summarized.

4.  Explanations  based  on  semiotic  morphisms  that  preserve  the  level  structure,  especially 
the  basic  levels  (if  there  are  any),  will  result  in  superior  performance  to  morphisms  that 
do  not. 

5.  Explanations  based  on  semiotic  morphisms  that  preserve  primary  constructors  (if  there 
are any) will result in superior performance to explanations based on morphisms that do
not.

6.  Explanations  based  on  semiotic  morphisms  that  preserve  properties  at  the  expense  of 
basic  levels  or  primary  constructors  will  produce  inferior  performance  to  explanations 
based  on  morphisms  that  preserve  basic  levels  or  primary  constructors  at  the  expense  of 
properties. 

7. Comprehension will be assisted by the presence of prior text reference.

6.2  Hypotheses  at  the  Level  of  Discourse  Structure 

8. Comprehension will be assisted by the inclusion of summaries.

9.  Comprehension  will  be  assisted  more  by  initial  placement  of  summaries  than  by  medial 
or  final  placement. 

10. Comprehension will be assisted more by a broad tree than by a deep tree. A more
precise formulation of this hypothesis is that structures with smaller penetrance will be
more easily comprehended than structures with larger penetrance.

11. Comprehension will be assisted by explicit markers of discourse structure; these may be
either  verbal  or  visual. 

12.  Comprehension  will  be  hampered  by  interruption  of  parallel  structures,  even  if  the 
interruption  represents  a  summary. 


13.  Complexity  of  linguistic  structure  and  strength  of  marking  tend  to  attenuate  within  a 
parallel  structure,  and  the  greater  the  number  of  parallel  items,  the  greater  the  degree  of 
attenuation. Comprehension will be assisted by or unaffected by such attenuation of
structural  marking  if  the  basis  of  the  parallelism  has  been  comprehended  by  the 
audience. 

14. For novices, comprehension will be superior with pseudonarrative structure rather than
reasoning  structure.  (Note  that  all  of  the  subjects  will  be  novices.) 

15.  Comprehension  will  be  superior  if  the  focus  and  level  of  a  summary  correspond  to  the 
focus  and  level  of  the  structure  being  summarized. 

6.3 Hypotheses at the Sentential Level

16.  Comprehension  will  be  assisted  if  the  strength  of  a  POP  marker  is  proportional  to  the 
size  of  the  movement  in  the  tree  that  it  accomplishes. 

17.  Comprehension  will  be  assisted  by  parallel  syntactic  placement  of  semantically  parallel 
information  units. 

18.  Comprehension  will  be  assisted  if  information  that  is  structurally  important  is  placed  in 
syntactically  heavy  constituents. 

I. Method

This  appendix  brings  together  various  discussions  from  the  main  text  on  the  methodology  of 
Phase I of the project, including the order of experimental tasks, selection of subjects, and
experimental  procedures  employed,  and  also  outlines  the  methodology  of  Phase  II. 

1.1  The  Order  of  Experimental  Tasks 

1.1.1  Phase  I 

The  first  series  of  project  activities  has  been  directed  toward  the  design  of  explanatory 
protocols  for  experimental  test  in  Phase  II. 

1.  A  circuit  diagnosis  problem  (see  below)  was  presented  to  individuals  whose  teaching 
should  benefit  from  the  proposed  improved  instructional  paradigms.  We  used  faculty  in 
engineering  and  electronics  at  a  local  community  college  as  instructors,  and  community 
college  students  with  no  background  in  engineering  or  mathematics  as  subjects.  The 
instructors  were  briefed  on  the  problem,  and  then  asked  to  present  material  to  prepare 
students  to  perform  the  indicated  task.  A  variety  of  briefings  were  tested  before  a 
suitable  one  was  determined.  The  first  presented  the  logic  box  as  a  piece  of  equipment 
coming  off  an  assembly  line  which  was  to  be  tested  by  the  student.  This  briefing  proved 
to  be  unsatisfactory  because  it  elicited  only  explanations  at  the  behavioral  level,  and 
could  not  be  used  to  elicit  any  of  the  more  complex  cognitive  levels.  The  second  form  of 
briefing  presented  the  box  as  the  control  device  of  a  set  of  sluices,  and  the  final  version 
elaborated  this  to  a  control  device  for  an  irrigation  system  providing  varying  mixtures  of 
fertilizer  and  water. 

2. Five instructors were used as subjects in six experiments. All experiments were recorded
on  audio  tape  and  then  transcribed,  yielding  a  total  of  P24  pages  of  transcript  for  the 
instructor  briefing  and  subsequent  instructional  session.  (There  are  also  debriefings  for 
students  and/or  instructors  for  some  sessions.)  Each  such  session  lasted  between  one- 
half  and  one  hour.  The  first  two  experiments  used  the  same  instructor,  and  also  used 
groups  of  community  college  students  as  an  audience,  4  students  in  the  first  and  5  in  the 
second.  Instructors’  presentations  were  recorded  on  audio  tape. 

3. Following elicitation and transcription, we then analyzed the structure of the
explanations  obtained  using  current  linguistic  theory.  This  required  studying  linguistic 
strategies  at  the  levels  of  the  sentence  and  the  discourse  unit,  and  also  studying  the 
effect  of  different  kinds  of  questions  asked  by  students  on  the  elicited  explanation 
structure.  This  analysis  makes  use  of  the  mathematical  theory  of  discourse  structure. 


This  analysis  was  used  to  identify  significant  variables  and  to  formulate  hypotheses 
about  relations  between  linguistic  form  and  task  performance.  These  hypotheses  are 
presented  in  Section  6. 

4.  The  last  step  in  Phase  I  was  the  preliminary  selection  of  the  most  interesting  hypotheses 
for  experimental  testing  in  Phase  II.  Selection  is  based  on  the  following  criteria: 
likelihood  that  the  hypothesis  will  have  significant  effects  on  learning;  possibility  of 
incorporating and varying the variable of interest in a natural discourse; possibility of
accurate  measurement  of  the  variable  of  interest;  and  possibility  of  holding  the  other 
variables  present  constant. 

1.1.2  Phase  II 

In  this  phase  of  the  project,  we  will  first  collect  and  analyze  video-taped  versions  of  our 
explanation  task,  since  the  analysis  of  the  audio-taped  experiments  of  Phase  I  indicated  that 
some  of  the  most  theoretically  interesting  and  practically  important  issues  of  multimedia 
instruction  can  only  be  studied  using  video  data.  We  will  then  refine  the  hypotheses  in  the 
light  of  this  data,  and  subject  the  most  promising  hypotheses  to  experimental  validation.  The 
tasks  of  this  phase  are  the  following: 

5.  We  will  perform  at  least  two  video-taped  sessions  of  the  instructional  task,  and  will 
analyze  the  forms  of  multimedia  instruction  using  the  theory  of  semiotic  morphisms 
already  developed. 

6. Based on the results of Phase I and task 4, we will make a final selection of the
most  promising  hypotheses  for  testing. 

7.  To  test  these  hypotheses,  standard  variations  of  explanations  of  the  circuit  diagnosis 
problem  will  be  administered  to  groups  of  learners.  These  may  be  given  by  actors,  via 
videotape,  or  by  computer.  While  the  cell  design  depends  on  the  nature  and  number  of 
independent  explanation  variables  that  emerge  from  Phase  I,  enough  subjects  will  be 
tested to enable statistical generalization (e.g., at least 30 per cell of the design).

8. If the results suggest it, follow-up trials will be conducted with promising combinations
of  variables  or  setting  (e.g.,  with  vs.  without  interactive  discussion;  with  vs.  without 
diagrammatic  aids). 

9.  Dependent  measures  of  effect  (including  both  test  and  task  performance)  collected  from 
learners will be examined by analysis of variance. In addition, the contribution (if any)
of exogenous variables such as age, education level, and previous work history to
learners' response variables will be assessed.

10. Learning data will be examined primarily by means of analysis of variance, where
alternative explanatory approaches serve as independent variables and test (verbal) and
task (behavioral) outcomes serve as dependent variables. Associations among
background variables and dependent measures will be assessed correlationally;
exogenous variables that importantly influence outcomes can be incorporated into
major analyses as covariates. Results of these analyses are expected both to contribute
to a basic understanding of the effectiveness of alternative explanation approaches and to
provide a foundation for recommendations addressed to the design of an automatic
explanation generator and to the improvement of instructional discourse.
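
The analysis of variance named in tasks 9 and 10 can be illustrated with a minimal sketch. The scores, the variant names, and the function `one_way_anova_f` are invented for illustration and are not data or code from the project; the sketch covers only the one-way case and leaves out the covariates mentioned in task 10.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up comprehension scores for three hypothetical explanation variants.
scores = {
    "variant_a": [4, 5, 6],
    "variant_b": [6, 7, 8],
    "variant_c": [8, 9, 10],
}
f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f_stat, df_b, df_w)
```

Here each explanation variant plays the role of one level of the independent variable, and the scores stand in for a single dependent measure. A real design would have at least 30 learners per cell, as stated in task 7.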

1.2  Procedures 

The  task  for  both  the  elicited  explanations  and  the  learning  trials  makes  use  of  the  logic  box 
of  Figure  1.  This  box  has  two  switches  and  four  lights,  each  light  being  a  logical  function  of 
the  positions  of  the  two  switches.  In  Phase  I,  instructors  were  shown  the  box  and  how  it 
works;  they  were  then  requested,  in  a  nondirective  manner,  to  provide  an  explanation  of  how 
it works to a "typical" group of students.
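
To make the device concrete, the following sketch enumerates a truth table for a box with two switches and four lights. The four functions chosen here (AND, OR, XOR, NAND) are assumptions for illustration only; the actual functions of the logic box in Figure 1 are not specified in this section.

```python
from itertools import product

# Hypothetical light functions; the real wiring of the box is not assumed.
LIGHTS = {
    "light_1": lambda a, b: a and b,        # AND
    "light_2": lambda a, b: a or b,         # OR
    "light_3": lambda a, b: a != b,         # XOR
    "light_4": lambda a, b: not (a and b),  # NAND
}


def truth_table(lights):
    """List (switch_a, switch_b, light_states) for every switch setting."""
    rows = []
    for a, b in product([False, True], repeat=2):
        rows.append((a, b, tuple(bool(f(a, b)) for f in lights.values())))
    return rows


for a, b, outputs in truth_table(LIGHTS):
    print(int(a), int(b), [int(o) for o in outputs])
```

An explanation of the box amounts to conveying this table, or the functions behind it, to the learner in some order and through some medium.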

In  the  second  part  of  Phase  II,  groups  of  students  will  be  presented  with  instructions  about 
how  the  logic  box  works.  Then  each  will  be  tested,  both  verbally  and  behaviorally,  for 
comprehension.  Phase  I  procedures  have  been  administered  to  subjects  (instructors) 
individually,  while  Phase  II  procedures  apply  to  groups.  Data  from  Phase  I  consists  of 
verbatim  protocols  for  linguistic  analysis,  while  data  from  Phase  II  will  consist  primarily  of 
standardized  test  and  task  scores  for  statistical  analysis. 

1.3  Subjects 

Subjects for Phase I procedures were community college instructors who are accustomed to
giving  explanations  of  circuit  logic.  Elicitation  continued  until  a  variety  of  patterns  had  been 
observed  and  replicated. 

Subjects  for  Phase  II  will  be  individuals  (male  and  female)  aged  about  17  to  25  who  are  not 
specially  trained  or  experienced  in  circuit  logic.  The  N  will  be  determined  by  the  cells  of  the 
design  for  testing  hypotheses  generated  in  Phase  I  efforts.  A  minimum  of  250  and  a  maximum 
of 550 subjects are anticipated.

II. Grammar of the Command and Control
Speech Act Chain

II.1 Categories of the Grammar

This appendix discusses the command and control speech act chain, a specific kind of speech
act chain which is the most typical discourse type for aviation discourse. In the most general
sense, these rules can serve as an example of how any speech act chain can be analyzed. More
specifically, we have found that the command and control speech act chains characteristic of
aviation discourse are formally identical to the speech act chains of instructional discourse, in
the sense that the same formal grammar describes them.

In  the  aviation  context,  operationally  relevant  speech  act  chains  typically  concern  possible 
actions or actions that have already been performed. According to the usual definition [Searle
60, Searle 71], speech acts can also be seen as linguistic acts that alter the perceived state of
the  world.  This  subsection  presents  a  category  system  that  includes  both  linguistic  and 
physical  acts;  this  is  necessary  for  the  formal  description  of  the  command  and  control  speech 
act  chains. 

The most general category is acts. This includes physical acts, command and control
speech acts, and acknowledgements of such speech acts. A more specific category is
speech acts, the basic category of interest for command and control. This category includes
requests, reports, and declarations.

Additional  utterance  categories  of  interest  for  the  command  and  control  speech  act  chain  are 
plans  and  explanations,  which  may  be  embedded  within  a  command  and  control  speech  act 
chain.  (Plans  and  explanations  as  they  occur  in  the  command  and  control  speech  act  chain 
are  the  same  discourse  types  discussed  in  Section  2.1.) 

II.2 Subordination

This  subsection  discusses  the  elements  used  to  construct  command  and  control  speech  act 
chains.  These  elements  are  of  two  types:  the  speech  acts  used  in  command  and  control;  and 
the subordinators that indicate the relationships among them. The present discussion focusses
on how these categories function within the formal grammar of command and control speech
act chains. An abbreviation for use in graphical representations is given for each
subordinator; these abbreviations use 'square brackets,' i.e., [...].

1. CHAIN: This node type is the top level subordinator for a sequence of command and
control speech acts having the same major propositional content and constituting a
command and control speech act chain. This node therefore marks the fact that a
sequence of utterances is indeed a speech act chain; it is not usually indicated explicitly
in the actual sequence of utterances. The abbreviation is simply [CHAIN].

2.  REQUEST:  Requests  are  the  most  typical  command  and  control  speech  acts.  They 
include  questions,  commands  and  suggestions.  (A  command  can  be  viewed  as  a  request 
that  has  been  ratified  by  the  speaker  with  relevant  authority.)  In  the  formal  grammar,  a 
request  must  have  the  form  of  a  request  node  subordinating  a  single  subtree,  which  is  the 
act  that  is  requested.  (Searle’s  taxonomy  calls  these  'directives.')  The  abbreviation  is 
[REQ]. 

3.  REPORT:  A  report  is  an  indication  of  some  state  of  the  world.  The  abbreviation  is 
[REP].  In  the  formal  grammar,  reports  have  the  form  of  a  [REP]  node  subordinating  a 
single subtree giving the act or state reported. (11b) is an example.

(11a) CAM-2 Ah, what's the fuel show now buddy?
(11b) CAM-3 Five
(11c) CAM-2 Five

(1748:54)

4.  ACKNOWLEDGE:  A  command  and  control  speech  act  (e.g.,  a  request  or  declaration) 
can  be  acknowledged;  but  challenges,  supports,  and  other  acknowledgements  cannot  be 
acknowledged.  (This  is  the  kind  of  constraint  on  sequencing  that  the  rules  below  are 
intended to capture.) For example, (11b) is an acknowledgement. The abbreviation is
[ACK].  An  [ACK]  node  indicates  the  subordination  of  an  acknowledgement  to  the 
speech  act  that  it  acknowledges. 

(12a) C-1 You gotta keep em running, Frostie
(12b) C-3 Yes, sir

(1808:42)

Two  interesting  further  points  about  [ACK]  nodes  are:  (1)  the  speaker  of  an 
acknowledgement  must  be  among  the  addressees  of  the  request  or  report  that  it 
acknowledges;  and  (2)  more  than  one  addressee  may  produce  an  acknowledgement  of  the 
same  speech  act. 

5.  STATEMENT/REASON:  Subordinates  a  request  or  report  on  the  left,  and  a  reason 
supporting  it  on  the  right.  It  is  abbreviated  [ST/RSN].  It  may  also  occur  in  the 
opposite order, abbreviated [RSN/ST].

6. STATEMENT/CHALLENGE: Subordinates a request or report on the left, and a
challenge to it on the right. It is abbreviated [ST/CH]. It may also occur in the
opposite order, abbreviated [CH/ST].

7.  GOAL/PLAN:  Subordinates  a  goal  on  the  left,  and  a  plan  to  achieve  it  on  the  right. 
Abbreviated  simply  [GOAL/PLAN].  It  may  also  occur  in  the  opposite  order, 
abbreviated  [PLAN/GOAL]. 
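
The subordinators listed above can be modelled as labelled tree nodes. The sketch below is one possible encoding, not the authors' notation: the `Node` class, the `render` function and the linear bracket format are assumptions, as is the reading of (12a) as a request.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A subordinator node; `label` is its abbreviation without brackets."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)


def render(tree):
    """Render a tree on one line, writing subordinators in square brackets."""
    if isinstance(tree, str):
        return tree  # a leaf: the propositional content of one utterance
    inner = " ".join(render(child) for child in tree.children)
    return f"[{tree.label}]({inner})"


# Example (12), read here as a request (12a) acknowledged by (12b).
example_12 = Node("ACK", [Node("REQ", ["12a"]), "12b"])
print(render(example_12))  # [ACK]([REQ](12a) 12b)
```

The left-to-right order of children carries meaning, since several subordinators (for example [ST/RSN] versus [RSN/ST]) differ only in which element comes first.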

II.3 Rules

This  subsection  gives  the  rules  of  the  grammar  for  command  and  control  speech  act  chains  in 
simple English, and also in a graphical form in Figure II-1. This grammar expresses how these
speech  act  chains  are  constructed  in  real  time.  It  thus  defines  the  sequences  of  operationally 
relevant  speech  acts  that  are  possible  in  command  and  control  discourse,  and  indicates  some 
(but  not  all)  of  the  sequences  that  are  not  possible.  It  should  be  noted  that  this  is  a  grammar 
of  social  force  rather  than  of  linguistic  form;  that  is,  the  rules  apply  to  the  social 
interpretations  of  utterances,  rather  than  to  the  utterances  themselves,  or  to  the  sequences  of 
words  or  sentences  that  comprise  them. 

In this grammar, nodes that must subordinate other nodes have 'square brackets,' e.g.,
[ACK], and nodes that indicate categories that will later be filled have 'pointed brackets,'
e.g., <REPORT>. The first two rules simply define subcategories of given categories. They
are:

1.  A  command  and  control  speech  act,  abbreviated  <SPACT>,  may  be  a  request,  a 
report,  or  a  declaration,  abbreviated  <REQ>,  <REPORT>  and  <DECL> 
respectively. 

2.  An  act,  abbreviated  <ACT>,  may  be  a  <SPACT>,  an  acknowledgement,  or  a 
physical  act,  abbreviated  <ACK>  and  <PHACT>  respectively. 
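
Rules 1 and 2 define a small category hierarchy, which can be written down as a lookup table. This is a minimal sketch; the names `SUBCATEGORIES` and `is_a` are hypothetical helpers and not part of the grammar itself.

```python
# Direct subcategories, transcribed from rules 1 and 2.
SUBCATEGORIES = {
    "ACT": {"SPACT", "ACK", "PHACT"},
    "SPACT": {"REQ", "REPORT", "DECL"},
}


def is_a(category, ancestor):
    """True if `category` is `ancestor` or can be derived from it."""
    if category == ancestor:
        return True
    return any(is_a(category, child)
               for child in SUBCATEGORIES.get(ancestor, ()))


print(is_a("REQ", "ACT"))    # True: a request is a speech act, hence an act
print(is_a("ACK", "SPACT"))  # False: an acknowledgement is an act only
```

The second call shows why the two rules are kept separate: an acknowledgement is an act but not a command and control speech act.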

The basic entity being formalized, the command and control speech act chain, is indicated by
a [CHAIN] node; all the speech acts that constitute a given chain will be subordinated to one
such node. The beginning of the production of a command and control speech act chain is a
single [CHAIN] node with two subordinate <SPACT> nodes; the fact that there are two
such nodes expresses the fact that there must be at least two speech acts in a command and
control speech act chain. The basic rule of development for command and control speech act
chains is simply:

3. A [CHAIN] node with n descendent nodes can be elaborated into a [CHAIN] node with
n+1 descendents. This expresses the fact that a command and control speech act chain
may be of any length; that is, it may contain any number of speech acts.

The next two rules are basically parallel; they indicate how <REQ> and <REPORT>
nodes can be elaborated:

4.  A  <REQ>  node  can  be  expanded  into  a  [REQ]  node  subordinating  an  <ACT>  node. 
This  means  that  any  request  is  a  request  for  an  action,  either  a  physical  action  or  a 
speech  act. 

5. A <REPORT> node can be expanded into a [REPORT] node subordinating an
<ACT> node. This means that any report is a report either of an action (a physical act
or a speech act) or of a state of the world.

Next is a set of three rules that may be applied to any node [XX] that is either a [REQ] or a
[REPORT] node subordinating an arbitrary subtree:

6. An [XX] node subordinating a subtree may be replaced by an [ACK] node subordinating
[XX]  with  its  subtree  on  the  left,  and  an  <ACK>  node  on  the  right.  This  means  that 
any  report  or  request  may  be  acknowledged. 

7. An [XX] node subordinating a subtree may be replaced by either: a [ST/RSN] node
subordinating the [XX] node with its subtree on the left, and subordinating an
<EXPL> node on the right; or a [RSN/ST] node with the same subordinate subtrees in
the opposite order. This rule means that any report or request may be supported by
giving a reason (RSN), having the formal structure of an explanation.

8.  An  [XX]  node  subordinating  a  subtree  may  be  replaced  by  either:  a  [ST/CH]  node 
subordinating  the  [XX]  node  with  its  subtree  on  the  left,  and  an  <EXPL>  node  on  the 
right; or else a [CH/ST] node with the same subordinates in the opposite order. This
rule  means  that  any  report  or  request  may  be  challenged  by  a  speaker  giving  an 
explanation  of  why  it  is  a  bad  idea. 

The  final  rule  permits  the  introduction  of  planning. 

9. A [REQ] node subordinating an arbitrary subtree may be replaced by a [GOAL/PLAN]
node subordinating the [REQ] node with its subtree on the right, and a <PLAN> node
on the left. This means that any request may be incorporated as part of a plan; that is,
the simple process of requesting an act can be elaborated into the process of planning.
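
Rules 3 and 6 through 9 are tree rewrites, and can be sketched as functions on nested tuples of the form `(label, child, ...)`. The function names and the tuple encoding are assumptions made for this illustration; category placeholders such as `<ACK>` are kept as plain strings, and the left/right placement in `plan` follows the wording of rule 9.

```python
ELABORABLE = {"REQ", "REPORT"}  # the [XX] nodes of rules 6 to 8


def _require(tree, allowed, rule):
    """Raise ValueError unless `tree` is a node whose label is allowed."""
    if not isinstance(tree, tuple) or tree[0] not in allowed:
        raise ValueError(f"rule {rule} does not apply to {tree!r}")


def extend_chain(chain, new_child="<SPACT>"):
    """Rule 3: a [CHAIN] with n descendents becomes one with n+1."""
    _require(chain, {"CHAIN"}, 3)
    return chain + (new_child,)


def acknowledge(tree):
    """Rule 6: [XX] becomes [ACK] over [XX] (left) and <ACK> (right)."""
    _require(tree, ELABORABLE, 6)
    return ("ACK", tree, "<ACK>")


def support(tree, reason_first=False):
    """Rule 7: [ST/RSN] over [XX] and <EXPL>, or [RSN/ST] reversed."""
    _require(tree, ELABORABLE, 7)
    if reason_first:
        return ("RSN/ST", "<EXPL>", tree)
    return ("ST/RSN", tree, "<EXPL>")


def challenge(tree, challenge_first=False):
    """Rule 8: [ST/CH] over [XX] and <EXPL>, or [CH/ST] reversed."""
    _require(tree, ELABORABLE, 8)
    if challenge_first:
        return ("CH/ST", "<EXPL>", tree)
    return ("ST/CH", tree, "<EXPL>")


def plan(tree):
    """Rule 9 as worded: <PLAN> on the left, the [REQ] subtree on the right."""
    _require(tree, {"REQ"}, 9)
    return ("GOAL/PLAN", "<PLAN>", tree)


request = ("REQ", "<ACT>")
acked = acknowledge(request)
print(acked)
try:
    acknowledge(acked)  # an acknowledgement cannot itself be acknowledged
except ValueError as err:
    print(err)
```

Because rules 6 to 8 accept only [REQ] and [REPORT] nodes, sequences such as an acknowledgement of an acknowledgement or of a support are rejected, which is the kind of sequencing constraint the grammar is meant to capture.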

These rules are all given graphically in Figure II-1; graphical indications of focus of attention
are also given there. An extended example is given in the following subsection, illustrating
how these rules are used to analyze an actual command and control speech act chain.


Figure II-1: Graphical Presentation of Command and Control Rules

II.4 An Example of a Command and Control Speech Act Chain

The  purpose  of  the  preceding  discussion  has  been  to  describe  some  constraints  on  command 
and  control  speech  act  chains,  and  in  particular,  to  indicate  some  possible  and  impossible 
embeddings  of  social  force.  That  is,  we  have  attempted  to  specify  what  sequences  of  speech 
acts form command and control speech act chains, and what sequences do not. For example,
an  acknowledgement  of  a  support  of  a  request  for  an  act  A  should  not  occur,  although  an 
acknowledgement  of  a  request  for  an  act  A  and  a  request  for  a  support  of  a  request  for  an  act 
A  may  occur. 

To  illustrate  this  kind  of  sequencing,  let  us  consider  the  data  in  example  (13): 


(13a) CAM-1 Hey Frostie
(13b) CAM-3 Yes sir
(13c) CAM-1 Give us a current card on weight figure about another fifteen minutes
(13d) CAM-3 Fifteen minutes?
(13e) CAM-1 Yeah give us three or four thousand pounds on top of zero fuel weight
(13f) CAM-3 Not enough
(13g) CAM-3 Fifteen minutes is gonna really run us low on fuel here
(13h) CAM-? Right

(1750)

First of all, (13a) and (13b) form what is termed a "call-response" pair, that is, a call for
attention followed by an acknowledgement that the addressee is attending. Using the
concepts of this study, this can be seen as a request having empty propositional content,
followed by an acknowledgement; it cannot be seen as a command and control speech act
chain, because chains must have more than one subordinate node. Thus the pair (13a-b) is
indicated as shown in Figure II-2, where 0 indicates empty propositional content. Adding
(13c-d) to this yields the tree shown in Figure II-3, where c denotes the propositional content
of (13c) and d that of (13d).

 [ACK]
 /   \
0     0

Figure II-2: A Call-Response Pair

       [CHAIN]
       /     \
   [ACK]    [ST/CH]
   /   \     /   \
  0     0   c     d

Figure II-3: A Challenge

(13e) refines this propositional content to say that there will be three or four thousand pounds
in fifteen minutes, denoted here as e. This is followed by an unusually strong challenge in
(13f), the propositional content of which, 'Not enough,' is indicated by f in Figure II-4. Rather
than repeating the two subtrees of Figure II-3, we here denote them as t1 and t2,
respectively.

         [CHAIN]
        /   |   \
      t1   t2   [ST/CH]
                 /   \
             [REQ]    f

Figure II-4: A Further Challenge


Finally, (13g) is a supporting explanation of (13f), and (13h) is a support of (13g), and thus of
(13f). Thus, the social force of this whole sequence could be notated as in Figure II-5, where g
is the propositional content of (13g) and h that of (13h).


         [CHAIN]
        /   |   \
      t1   t2   [ST/CH]
                 /    \
             [REQ]   [ST/RSN]
               |      /    \
               e  [ST/RSN]  h
                   /    \
                  f      g

Figure II-5: A Complete Command and Control Speech Act Chain
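
The complete analysis of example (13) can be checked mechanically with a short sketch. The nested-tuple encoding and the `leaves` function are assumptions made for this illustration, and the leaf written `f` under the inner [ST/RSN] node is inferred from the text's statement that (13g) supports (13f). Since the grammar describes how chains are built in real time, one would expect the leaves, read left to right, to appear in utterance order.

```python
# Subtrees t1 and t2, as in Figures II-2 and II-3.
T1 = ("ACK", "0", "0")    # (13a), (13b): empty propositional content
T2 = ("ST/CH", "c", "d")  # (13c), (13d)

FIGURE_II_5 = (
    "CHAIN",
    T1,
    T2,
    ("ST/CH",
     ("REQ", "e"),
     ("ST/RSN", ("ST/RSN", "f", "g"), "h")),
)


def leaves(tree):
    """Collect propositional contents left to right."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree[1:]:  # tree[0] is the subordinator label
        result.extend(leaves(child))
    return result


print(leaves(FIGURE_II_5))
```

Under this encoding the eight leaves correspond one to one with utterances (13a) through (13h), in the order in which they were spoken.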